ABSTRACT


This  paper  is  concerned  with  the  evaluation  of  algorithms 
used  by  passive  infrared  sensors  to  discriminate  between  signals 
due  to  target  sources  and  those  due  to  background  clutter.  The 
discussion is essentially restricted to the case of point
targets.

The goal is to obtain a rough estimate of performance
against minimum standards. For this purpose the analysis
assumes a simple mathematical model for the background clutter
distribution: namely, that it is multivariate Gaussian over
the spatial and spectral data channels provided by the sensor.
The  paper  also  discusses  experimental  evidence  for  and  against 
such  a  model,  as  well  as  certain  more  explicit  statistical 
models  that  have  been  proposed  for  the  spatial  distribution  of 
clutter. 

Other topics discussed are CFAR optimum processing, linear
filters, the effect of using ratios of spectral components
rather than the components themselves for processing in
multi-color systems, and background normalization. Also
discussed is the relationship between the effectiveness of
tracking algorithms and the preliminary screening of targets
by CFAR detection algorithms.

EXECUTIVE SUMMARY


Separating targets from their backgrounds is a signal
processing problem that is a major concern for infrared sensors.

This paper reviews several of the approaches that are now under
serious consideration for use by infrared surveillance systems
to deal with the problem, particularly for the case of point
targets.

In the course of the review, certain observations and
conclusions scattered throughout the text may have more value
than other parts of the text for those who have a general
interest in evaluating alternative approaches to the infrared
target discrimination problem. The parts that contain the
supporting analysis must, of necessity, be somewhat drawn out
and mathematically formal in order to provide the rigor needed
to make a hard comparison between methods, or to disprove a
common assumption. Thus some, perhaps most, of the material in
this paper consists of technical detail that will undoubtedly
be largely ignored by many readers whose interest in signal
processing theory is only peripheral.

Therefore, the following summary is presented in an attempt
to gather together the essence of this paper, in the hope that
it may thereby be rendered more accessible to the reader whose
interests are less specialized. Each item is headlined and
annotated for easy reference to the pertinent analysis or
discussion contained in the main body of the paper.

EXPERIMENTAL  SUPPORT  FOR  STATISTICAL  MODELS  OF  TERRAIN  BACKGROUND 

Empirical evidence indicates that infrared radiance from
natural terrain, such as a forest or a desert, is, to a good
approximation, normally distributed for a variety of wavelength
bands in both the solar and thermal regions of the spectrum.
This is less true of scenes that have been affected in some way
by protracted human intervention, e.g., farm land, proving
grounds, large cities. In general, the approximation is better
at night than during the day.

On  the  other  hand,  empirical  evidence  does  not  support 
certain  theoretical  models  that  have  been  proposed  for  the 
statistical  spatial  distribution  of  terrain  background  radiance. 
Specifically, the data are inconsistent with the so-called
two-dimensional Markoff process distributions that are
characterized by exponential correlation functions. In fact,
some versions of this type of model are not even theoretically
self-consistent. (More detailed discussions of these matters
and supporting analyses appear in Chapter II, Section D.)

SUB-OPTIMAL  NATURE  OF  LINEAR  FILTERS 

For Constant False Alarm Rate (CFAR) detection of point
targets, linear filters are sub-optimal in general. The linear
filter that, in theory, maximizes the signal-to-noise ratio for
a background whose spatial distribution is statistically
homogeneous is a limiting case that the true, nonlinear, optimal
filter would approach if the temperature of the target were very
large compared to that of the background and the size of the
target were small compared to the Instantaneous Field Of View
(IFOV) of a single infrared detector. (The analysis supporting
these conclusions appears in Chapter III, Section B.)

THE  VALUE  AND  LIMITATIONS  OF  TRACKING  ALGORITHMS 

Target discrimination algorithms are of two types: those
whose purpose is clutter rejection and those, referred to as
tracking algorithms, that distinguish targets by their
characteristic trajectories. Most infrared systems use both types.

Tracking algorithms, which are a form of Moving Target
Indication (MTI) technique, are, in principle, the only hope for
achieving the very low false alarm rates that are typically
required for the detection of point targets by Infrared Search
and Tracking (IRST) system specifications. Nevertheless, it
is also necessary for this purpose to provide preliminary
clutter rejection means, such as spatial filtering and adaptive
thresholding in one or more spectral channels, to reduce the
number of false detections before invoking tracking algorithms.

System designers often attribute the reason for requiring
a preliminary clutter rejection process to limitations of the
available computer capacity, i.e., memory size and computer
speed. This would seem to imply that technological advances,
e.g., the introduction of VHSIC and VLSIC, will eventually
make such a procedure unnecessary.

However, the requirement is actually independent of
computer capacity. That is, tracking algorithms will work only
if, initially, the expected number of false detections is below
a certain critical value. Moreover, the effectiveness of a
tracking algorithm is extremely sensitive to errors unless the
a priori false detection probability can be made small by those
other, preliminary, signal processing techniques. (The analysis
supporting these conclusions appears in Chapter III, Section C.)

THEORETICAL  IMPLICATIONS  AND  POSSIBLE  IMPROVEMENT 
OF  BACKGROUND  NORMALIZATION 

A common form of adaptive thresholding, sometimes known
as "background normalization," which is an averaging process
implemented with a two-dimensional linear filter, is equivalent
to a least-square-error fit of a linear function to data
obtained from measurements of the background radiance distribution.
It follows that the next order improvement would be a quadratic
least-square-error fit. The quadratic fit can also be
accomplished by means of an averaging (weighted in this case)
process that is implemented with a two-dimensional linear filter.
(The derivation of these results appears in Chapter IV, Section B.)
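
The equivalence stated above can be checked numerically. The
following is a minimal sketch, not the derivation of Chapter IV:
it assumes an n by n window with subscripts centered on zero,
fits a polynomial by least squares, and evaluates the fit at the
central pixel. The names `solve` and `center_weights` are
illustrative. Under these assumptions the linear fit reduces to
the plain window average, while the quadratic fit becomes a
weighted average whose weights still sum to one.

```python
from itertools import product

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))  # partial pivoting
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]

def center_weights(n, basis):
    """Weights w[(i, j)] for which the least-squares fit of `basis`,
    evaluated at the window center (0, 0), equals sum of w[(i, j)] * S[(i, j)]."""
    h = (n - 1) // 2
    pts = list(product(range(-h, h + 1), repeat=2))
    # Gram matrix of the normal equations, G = B^t B
    G = [[sum(f(i, j) * g(i, j) for i, j in pts) for g in basis] for f in basis]
    # u = G^{-1} b0, where b0 holds the basis functions evaluated at the center
    u = solve(G, [f(0, 0) for f in basis])
    return {(i, j): sum(c * f(i, j) for c, f in zip(u, basis)) for i, j in pts}

# Assumed polynomial bases: a plane, and a full quadratic surface.
linear_basis = [lambda i, j: 1.0, lambda i, j: float(i), lambda i, j: float(j)]
quadratic_basis = linear_basis + [
    lambda i, j: float(i * i), lambda i, j: float(i * j), lambda i, j: float(j * j)]

linear = center_weights(3, linear_basis)
quadratic = center_weights(3, quadratic_basis)
print(linear[(0, 0)], linear[(1, 1)])   # both approximately 1/9
print(quadratic[(0, 0)], quadratic[(1, 0)], quadratic[(1, 1)])  # approx. 5/9, 2/9, -1/9
```

For the 3 by 3 window the quadratic weights are unequal and the
corner weights are negative, which is why the text describes the
quadratic case as a weighted averaging process.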

THE  FALSE  ALARM  PENALTY  IMPOSED  BY  THE  USE 
OF  SPECTRAL  COMPONENT  RATIOS 

For multi-color or spectral discrimination systems it is
sometimes the practice to work with ratios of spectral components
rather than the components themselves. For example, a
two-color system with radiance measurements $J_1$ and $J_2$ in
the two spectral bands would use a one-dimensional target
discrimination algorithm operating on the ratio $J_1/J_2$ rather
than a two-dimensional algorithm operating on the pair $J_1$,
$J_2$. This usually results in significantly higher false alarm
rates, sometimes by several orders of magnitude, than would be
generated by the equivalent two-dimensional process. (The proof
of these conclusions appears in Chapter IV, Section C.)


I.  INTRODUCTION 


This paper is concerned with algorithms used by passive
infrared (IR) sensors to discriminate between signals due to
target sources and those due to background clutter. The
primary objective is to formulate a simple methodology for
evaluating such algorithms.

The goal has been to develop an evaluation procedure that
is easier to implement and is less specific than a detailed
computer simulation, which is the usual approach to this
objective. The purpose here is not to supplant computer
simulation as a means of evaluating a signal processor's logic
design. Rather, it is to provide an analytical tool that can be
used for a rough, preliminary assessment of the feasibility or
the potential of different processing schemes.

The scope of this paper is essentially restricted to the
case of non-imaging systems, i.e., those, such as the infrared
search and tracking system (IRST), for which targets behave as
point sources under ordinary operating conditions.* The point
target assumption implies that discrimination algorithms must
be of an abstract nature, relying upon certain target and
background signatures that are not associated with easily
identified geometric attributes, such as size and shape, that
would be available to an imaging system. However, signatures may
be derived from any combination of spectral and temporal, as well
as certain limited spatial, properties of targets and backgrounds.

* The term "non-imaging" seems appropriate in the present context
even if the system in question is capable of providing an image
of the background (although not of the target) as long as it
does not, in fact, make use of such an image for target
discrimination.

The proposed algorithm evaluation methodology depends upon
a mathematical model that is based on the possibility of
predicting statistically the distribution of measured data over
some  number  of  channels.  Every  IR  sensor  system  defines  these 
channels  in  a  natural  way,  according  to  the  discriminants  that 
it  is  designed  to  use.  Each  pixel  in  the  spatial  distribution 
of  an  observed  scene,  the  observed  signal  from  each  spectrally 
resolved  wavelength  band,  and  each  time  frame  in  the  temporal 
sequence  of  observations  constitutes  a  separate  data  channel 
in  this  sense. 

The mathematical model assumes a signal processing logic
that divides the decision process for discriminating between
targets and background into two steps. The first is the
detection phase, which eliminates as many false alarms as
possible by means of one or more preliminary target detection
algorithms. The second is the declaration phase, which generates
the final decision as to the presence or absence of a target in
a given direction.

A preliminary detection algorithm, used in the first phase,
is a linear or nonlinear digital filtering operation followed
by thresholding. Tracking algorithms, which distinguish
between the resulting target and clutter detections by means of
their supposedly different trajectory characteristics observed
over time, provide the final, second phase, decision whether
or not to declare that a target is present.
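
As an illustration of the first phase only, the sketch below
applies one possible linear filter (the central pixel minus the
average of its neighbors in an n by n window) followed by a fixed
threshold. The filter, the threshold value, the test scene, and
the name `detect` are assumptions chosen for illustration; the
paper does not prescribe them.

```python
def detect(image, n=3, threshold=5.0):
    """Return (row, col) positions whose filtered value exceeds `threshold`.

    The filter is the central pixel minus the mean of the other pixels
    in the n-by-n window; positions too close to the edge are skipped.
    """
    h = (n - 1) // 2
    rows, cols = len(image), len(image[0])
    hits = []
    for r in range(h, rows - h):
        for c in range(h, cols - h):
            neighbours = [image[r + i][c + j]
                          for i in range(-h, h + 1)
                          for j in range(-h, h + 1)
                          if (i, j) != (0, 0)]
            filtered = image[r][c] - sum(neighbours) / len(neighbours)
            if filtered > threshold:
                hits.append((r, c))
    return hits

# Hypothetical scene: uniform background with one bright point source.
scene = [[10.0] * 5 for _ in range(5)]
scene[2][2] = 30.0
print(detect(scene))  # [(2, 2)]
```

Only the pixel that holds the point source in the center of its
window passes the threshold; detections of this kind would then
be handed to the tracking algorithms of the second phase.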

For the preliminary detection phase the mathematical model
assumes that the statistical distribution of IR radiance over
the data channels is adequately approximated by an N-variate
Gaussian probability distribution.* There are several
arguments to justify this assumption.

* This is a generalization of a similar model proposed in Ref. 2
for spectral discrimination.

First, experimental evidence suggests (Refs. 8 and 9) that
for a variety of terrain backgrounds,* although by no means all,
in selected spectral channels distributed over a band between
2 μm and 11.4 μm, Gaussian distributions fit measurement data
remarkably well. This is true for data taken over background
regions that comprise as many as two hundred thousand pixels.

Second,  although,  as  R.  A.  Steinberg  has  pointed  out,  the 
mean  background  radiance  can  be  expected  to  vary  over  space  and 
time,  the  variation  is  usually  gradual  except  for  cases  in 
which  glint  dominates.**  Thus,  the  assumed  N-variate  Gaussian 
distribution  can  be  regarded  as  a  local  approximation  to  the 
actual  N-channel  background  distribution,  valid  to  the  second 
order  in  terms  of  moments  of  the  corresponding  density  functions. 

It is sometimes argued that, although a distribution may
be approximately Gaussian out to 2 or 3 σ, acceptable IR sensor
system  false  alarm  rates  in  practice  are  so  low  that  the  tail 
of  the  distribution  is  also  significant.  This  would  be  true  if 
an  attempt  were  made  to  meet  the  false  alarm  specification  with 
preliminary  detection  algorithms  alone.***  However,  in  most 
cases  those  algorithms  are  used  primarily  to  thin  out  the  false 

* Unfortunately, the argument is limited in scope by the fact
that similar data for cloud backgrounds do not exist in
the literature at the present time.

** In Ref. 18, Steinberg, taking into account photon
fluctuations, analyzes the design of optimum filters for
thresholding against different spatial variations of a
background. His design concepts, as well as other adaptive
thresholding techniques, some of which have been implemented in
existing IR systems, depend by implication on the assumption
that the background variation will be gradual for the most part.

*** Reference 2, in fact, proposes a 12-color spectral detection
processing scheme that would do just that if the target and
background distributions happen to fit certain models that
the authors of the report have generated synthetically and
which assume N-variate Gaussian distributions over the 12
channels.

alarms during the preliminary detection phase, and the
responsibility for the final target declaration is reserved for
tracking algorithms. The burden of satisfying the false alarm
rate requirement then rests ultimately on the tracking
algorithms.
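
For scale, the one-sided tail probability of a Gaussian variable
beyond k standard deviations can be computed directly. The sketch
below uses only the standard normal distribution and is not tied
to any sensor specification; `gaussian_tail` is an illustrative
name.

```python
import math

def gaussian_tail(k):
    """One-sided probability that a standard Gaussian variable exceeds k sigma."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

for k in (2, 3, 4, 5, 6):
    print(f"{k} sigma: single-sample exceedance probability ~ {gaussian_tail(k):.2e}")
```

Under the Gaussian assumption a 3 σ threshold is exceeded by
roughly one clutter sample in a thousand, so false alarm
requirements far below that level would depend on the behavior of
the tail unless, as argued above, the tracking algorithms carry
that burden.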

Perhaps  the  most  important  argument  for  assuming  Gaussian 
distributions,  however,  is  that  they  furnish  a  minimum  standard 
of  acceptance.  That  is,  a  signal  processing  scheme  ought  to  be 
regarded  as  unacceptable  if  it  does  not  perform  well  against  a 
Gaussian  distributed  background.  Of  course,  the  converse 
statement  is  false;  therefore,  even  if  the  scheme  does  meet 
the  standard  there  may  still  be  cause  to  reject  it,  at  least 
for  some  applications. 

In this connection, it should be noted that it is possible
to include in an evaluation based on such a minimal acceptance
standard the effect of different scenarios which may imply not
only a change in the background, e.g., from sky to terrain, but
changes in other environmental factors as well. For example,
Ref. 4, using calculations obtained from a computer program
(5 cm-1 LOWTRAN5) for estimating propagation effects, discusses
the influence that range and the altitudes of both the target
and the sensor platform may have on spectral discriminants.
This influence, it is pointed out, would necessarily be reflected
in the evaluation of a target detection algorithm, particularly
one that relied upon data from multiple spectral channels.

Chapter II of this paper describes in more detail the
proposed mathematical model for evaluating discrimination
algorithms, as well as some of the model's ramifications when
applied to spatial discriminants in particular. The discussion
in Chapter II covers the explicit form of the model for both
spatial and spectral channels when targets are present or
absent. It also indicates how the extensive measurement data
presented in Ref. 9 can be used to test the validity of a class
of occasionally encountered hypotheses concerning the nature
of background spatial distributions for natural scenes.

Chapter III deals with optimum constant false alarm rate
(CFAR) detection. It also considers the relationship between
the effectiveness of preliminary CFAR detection algorithms and
the effectiveness of tracking algorithms used for the final
target declaration.

The general optimum CFAR processing rule presented in
Chapter III is essentially that found in Ref. 19, which,
however, refers to an earlier reference for its derivation. For
the sake of completeness an independent derivation of the rule
is given in Appendix A.*

Chapter  IV  analyzes  two  signal  processing  techniques  that 
are  sometimes  encountered  in  IR  processor  system  designs.  One 
is  a  method  for  adaptive  thresholding  against  spatially  varying 
backgrounds;  the  other  is  a  device  to  reduce  the  number  of 
degrees  of  freedom  to  be  considered  in  spectral  discrimination. 

This paper does not include numerical applications to
specific cases, except for one or two examples provided to
illustrate a point. However, the analysis that is applied to
developing the methodology for evaluating discrimination
algorithms leads naturally to some conclusions of a general
nature, which are noted in the text as they occur. These
conclusions also appear in Chapter V, along with a summary of
the principal ideas introduced in the earlier chapters.

* This might also have been done for a fundamental theorem,
introduced in Chapter II, concerning the probability
distribution that results from a linear transformation of
variables having an N-variate Gaussian distribution. The
theorem, however, is reasonably well-known and is heuristically
evident.

II. MATHEMATICAL MODELS


A.  DATA  CHANNELS 

An  IR  sensor  with  multiple  detectors  provides  data  that 
are  separated  naturally  into  discrete  channels,  each  of  which 
is  associated  with  the  output  signal  from  one  of  the  detectors. 
For signal processing purposes, however, it is useful to
separate the data into channels that are defined in terms of the
discriminants used by the sensor system for distinguishing
between target and clutter sources.

Multi-color  systems,  i.e.,  systems  that  rely  upon  spectral 
signatures  with  components  in  two  or  more  distinct  wavelength 
bands,  are  the  usual  examples  in  which  data  are  treated  from 
this point of view. However, it can be equally useful to
regard data as separated into spatial as well as spectral
channels, a point of view which this paper will adopt to some
advantage, for example, in discussing the effects of linear
spatial filtering.

The individual pixels in the background radiance scene
mapped by an IR sensor will determine the spatial channels as
perceived here. Actually, the number of such channels will
generally be limited by an n by n pixel sliding window.* The
window defines $n^2$ spatial channels, one for each pixel
contained within it, and is itself defined by whatever spatial
filtering algorithms the sensor may use for signal processing.

* The window could just as easily be rectangular. The implicit
assumption here that it is square is made for convenience, to
simplify to some extent the algebraic treatment of
two-dimensional arrays of channels.

It is convenient to require that n be an odd number because
the pixel at the center of the array will have a special role
in the mathematical model to be proposed here for characterizing
spatial discriminants. Specifically, if a target signal occurs
in the central channel the data in the full complement of $n^2$
spatial channels will be regarded as due to the presence of a
target. Otherwise, the target will be regarded as absent.

It is assumed that the detection algorithm, to the extent that
it is based on accurate knowledge of the target and clutter
background statistics, is deliberately designed to announce
that a detection has occurred if, and only if, the target
signal is in the central channel.

This convention implies a desirable, although not
necessarily achievable, precision in the location of a target by
the IR system. That is, as the array window scans the background
a true detection occurs only when the target coincides with the
central pixel.

B.  NOTATION  FOR  THE  SPATIAL  DISCRIMINANT  MODEL 

In general, data divided among several channels will be
treated as a vector each of whose components is the signal
strength associated with one of the channels. Unfortunately,
the single subscript notation ordinarily used in dealing with
a vector V in terms of its components $V_i$ conflicts with the
double subscript matrix notation that is more natural in
dealing with the two-dimensional array of signal strengths
$S_{ij}$ associated with an n by n array of spatial channels.

Reference 1 (p. 128) handles this problem by providing a
so-called stacking transformation that reorders the elements
of the array so that they constitute a one-dimensional sequence
which can be treated as the components of a vector in the
conventional format. Since the transformation is linear and
invertible, it is possible to apply standard algebraic
manipulations to the vector and change back to the
two-dimensional array format whenever it is convenient to do so.
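
A minimal sketch of such a stacking transformation and its
inverse follows. Row-by-row ordering is an assumption made here
for illustration; Ref. 1 may order the elements differently. The
names `stack` and `unstack` are likewise illustrative.

```python
def stack(array):
    """Reorder an n-by-n array into a vector of length n*n, row by row."""
    return [value for row in array for value in row]

def unstack(vector, n):
    """Inverse of `stack`: rebuild the n-by-n array from the vector."""
    return [vector[r * n:(r + 1) * n] for r in range(n)]

S = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
v = stack(S)
print(v)              # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(unstack(v, 3))  # recovers the original array
```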

For the purpose of this paper, however, the stacking
transformation seems an unnecessary complication that would
obscure certain geometric patterns or effects that result, for
example, when two or more linear filtering processes are
combined. Instead, a two-component vector subscript will be
introduced in place of the pair of subscripts ordinarily used to
designate an array element. That is, $S_{ij}$ becomes $S_k$,
where the subscript k is regarded as a vector with the
components i and j.

In  this  notation  a  sum  over  k  will  mean  a  double  sum  taken 
independently  over  all  values  of  i  and  j.  Also,  the  usual 
conventions  that  apply  to  vectors  apply  to  vector  subscripts. 
Thus, if two vector subscripts are equal it will mean that their
corresponding components are equal, and when the vector
subscript is 0 it will mean that both subscript components are
zero.

It is then possible to represent the linear transformation
of an array in the usual manner as a multiplication of a vector
by a matrix. That is, a linear transformation from the array
with elements $S_l$ to one with elements $S'_k$ will take the form

$$ S'_k = \sum_l a_{kl} S_l , $$

where k and l both represent two-component vector subscripts.
The matrix with elements $a_{kl}$ then actually has $n^4$
elements, and the symbol $a_{kl}$ may be understood to have four
scalar subscripts.

Sometimes it is necessary to deal with array vectors, or
transformations of the type just described, whose algebraic
representations depend in some explicit way on their subscripts.
When this happens it is usually possible to express such
quantities in dyadic form, so that despite the use of vector
subscripts it is no more difficult to perform explicit algebraic
manipulations with them than would be the case if their
subscripts represented ordinary scalar integers.


In order to emphasize its key role, the central channel in
an n by n array will be designated by the zero vector subscript,
which is equivalent to two zero scalars. Then, with scalar
subscripts ordered in the standard manner, with the conventional
reference to an array element's position by row and column,
negative subscripts will be used to designate elements to the
left of or above the center. That is, for an element $S_{ij}$,
i and j will both range over the integers from $-(n-1)/2$ to
$(n-1)/2$. For example, in the case n=3, that is, for a 3-by-3
or 9-element array, the array would have the form

$$
\begin{pmatrix}
S_{-1,-1} & S_{-1,0} & S_{-1,1} \\
S_{0,-1}  & S_{0,0}  & S_{0,1}  \\
S_{1,-1}  & S_{1,0}  & S_{1,1}
\end{pmatrix}
$$
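
The vector subscript convention maps naturally onto a table keyed
by the pair (i, j). The sketch below builds the centered
subscripts for n = 3 and applies a linear transformation in which
each output element is a sum, over all vector subscripts l, of a
matrix element times the input element with subscript l. The
particular matrix used, a reflection through the central channel,
and the names `subscripts` and `transform` are hypothetical
choices for illustration.

```python
from itertools import product

def subscripts(n):
    """All vector subscripts (i, j) of an n-by-n array centered on (0, 0)."""
    h = (n - 1) // 2
    return list(product(range(-h, h + 1), repeat=2))

def transform(a, S):
    """New array whose element k is the sum over l of a[k, l] * S[l]."""
    keys = list(S)
    return {k: sum(a[k, l] * S[l] for l in keys) for k in keys}

n = 3
K = subscripts(n)
S = {k: float(10 * k[0] + k[1]) for k in K}   # arbitrary sample values

# Hypothetical transformation: reflection through the central channel,
# i.e. the element at k is taken from the element at -k.
a = {(k, l): 1.0 if l == (-k[0], -k[1]) else 0.0 for k in K for l in K}

S_prime = transform(a, S)
print(len(a))                      # 81, i.e. n**4 matrix elements
print(S[(-1, 0)], S_prime[(1, 0)])  # the same value
```

Because a vector subscript is an ordinary tuple, equality and
negation of subscripts act component by component, which matches
the conventions adopted in the text.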

C.  PROBABILITY  DISTRIBUTIONS 

One  way  to  interpret  the  problem  of  detecting  the  presence 
of  a  target  against  a  clutter  background  is  to  regard  it  as  the 
problem  of  estimating  the  probability  that  the  target  is  present, 
given  the  information  acquired  from  the  data  provided  by  IR 
measurements. On the basis of this concept Ref. 2 has
introduced a minimum error criterion* for multi-color systems to
distinguish between targets and clutter by means of their
spectral characteristics.**

* A minimum error criterion in this context is one that
classifies each signal as due either to the target or to clutter
alone with the smallest possible probability of an erroneous
classification. Cf. Ref. 3, p. 269ff.

** See also the discussion in Ref. 4, Appendix B.

In many applications, especially those involving point
targets, however, the false alarm rate is a major concern. It is
a primary objective of the present paper to formulate a method
for evaluating target detection algorithms when this is, in fact,
the case. Accordingly, a related but slightly different approach
will be taken here. The concepts underlying this approach can
be summarized as follows.

For an N channel system each measurement set produces an N
component vector which may be thought of as representing a point
in N dimensional data space. The set of all such data points
that might be produced by clutter in the absence of a target has,
at least conceptually, an N-variate joint probability
distribution defined by a probability density function
$P_C(\mathbf{J})$, where $\mathbf{J}$ is a vector having
components $J_1, \ldots, J_N$ that may, individually, range over
all positive and negative real values. Similarly, there is
another such probability distribution, and a corresponding
density function $P_T(\mathbf{J})$, that is associated with the
presence of a target source.

Suppose that there is an algorithm whose purpose is to
decide whether a given measurement set, i.e., data point, was
produced in the presence or absence of a target. The algorithm
then has the effect of separating all of data space into two
complementary regions.

One  of  the  regions  R  will  consist  of  all  points  designated 
by  the  algorithm  as  due  to  a  target.  The  other  will  consist  of 
all  points  designated  as  due  to  clutter  in  the  absence  of  a 
target. 

The probability of false alarm (PFA) for any measurement
set is then equal to the integral of $P_C(\mathbf{J})$ over R; i.e.,

$$ \mathrm{PFA} = \int_R P_C(\mathbf{J}) \, dJ_1 \cdots dJ_N . \qquad (1) $$

Also, the probability of detecting a target (PTD) by means of
the algorithm applied to a single measurement set is equal to
the integral of $P_T(\mathbf{J})$ over R; i.e.,

$$ \mathrm{PTD} = \int_R P_T(\mathbf{J}) \, dJ_1 \cdots dJ_N . \qquad (2) $$
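
For a single channel (N = 1) and a decision region R consisting
of all values above a threshold, the two integrals reduce to
Gaussian tail probabilities once Gaussian densities are assumed
for clutter and target, as is done throughout this paper. The
sketch below shows the calculation; the function name and all
numerical values are hypothetical.

```python
from statistics import NormalDist

def pfa_ptd(threshold, clutter_mean, clutter_sigma, target_mean, target_sigma):
    """PFA and PTD for one channel when R is the set of values above `threshold`."""
    clutter = NormalDist(clutter_mean, clutter_sigma)
    target = NormalDist(target_mean, target_sigma)
    pfa = 1.0 - clutter.cdf(threshold)   # integral of the clutter density over R
    ptd = 1.0 - target.cdf(threshold)    # integral of the target density over R
    return pfa, ptd

# Hypothetical case: unit-variance clutter, target mean 5 sigma above clutter,
# threshold set at 3 sigma.
pfa, ptd = pfa_ptd(threshold=3.0, clutter_mean=0.0, clutter_sigma=1.0,
                   target_mean=5.0, target_sigma=1.0)
print(f"PFA ~ {pfa:.2e}, PTD ~ {ptd:.4f}")
```

Raising the threshold shrinks R, which lowers PFA and PTD
together; the choice of R is therefore the whole content of a
detection algorithm in this formulation.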

Throughout this paper it will be assumed that
$P_C(\mathbf{J})$ and $P_T(\mathbf{J})$ are both N-variate
Gaussian probability density functions.* That is, each will
have the form

$$ P(\mathbf{J}) = (2\pi)^{-N/2} \, |M|^{-1/2} \exp\left[ -\tfrac{1}{2} (\mathbf{J} - \bar{\mathbf{J}})^t M^{-1} (\mathbf{J} - \bar{\mathbf{J}}) \right] , \qquad (3) $$

where $M$ is the covariance matrix of the particular distribution,
$|M|$ is the determinant and $M^{-1}$ the inverse of $M$,
$\bar{\mathbf{J}}$ is the mean vector of the distribution, and
the superscript t denotes the transpose. In (3) ordinary matrix
multiplication is implied, so that a vector without a
superscript is to be regarded as a column vector while one that
has the superscript t is to be regarded as a row vector.
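
As a concrete instance, the sketch below evaluates the standard
N-variate Gaussian density, (2π)^(-N/2) |M|^(-1/2)
exp[-(1/2)(J - J̄)^t M^(-1) (J - J̄)], for the two-channel case
N = 2, where the determinant and inverse of M can be written out
explicitly. The function name and the sample values are
illustrative only.

```python
import math

def gaussian_density_2(J, J_bar, M):
    """N-variate Gaussian density for N = 2.

    J and J_bar are 2-component vectors; M is the 2-by-2 covariance matrix.
    """
    (m11, m12), (m21, m22) = M
    det = m11 * m22 - m12 * m21
    d1, d2 = J[0] - J_bar[0], J[1] - J_bar[1]
    # Quadratic form (J - J_bar)^t M^{-1} (J - J_bar), using the explicit
    # 2-by-2 inverse M^{-1} = [[m22, -m12], [-m21, m11]] / det.
    q = (m22 * d1 * d1 - (m12 + m21) * d1 * d2 + m11 * d2 * d2) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))

# Hypothetical two-channel example with correlated channels.
M = ((4.0, 3.0), (3.0, 9.0))
print(gaussian_density_2((1.0, 2.0), (0.0, 0.0), M))
```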

The covariance matrix and mean vector are the parameters
that specify a particular N-variate Gaussian distribution.
Therefore, when specific reference is made to $P_C$ or to $P_T$
in the expression for the density given by (3), $M$ and
$\bar{\mathbf{J}}$ will bear the appropriate subscript, C or T.

There are several arguments in favor of assuming Gaussian
probability distributions for the measured signal strengths.
Since a major objective of this paper is to devise a method
for testing clutter rejection algorithms analytically, the
principal argument is that the very least one might expect from
such an algorithm would be satisfactory performance when the
target and clutter signal strengths are Gaussian distributed.

* For properties of N-variate Gaussian probability distributions
see, for example, Ref. 5, Chs. 21-24 or Ref. 6.

There are certainly a number of environments for which
existing empirical data suggest that the assumption of Gaussian
statistics may be surprisingly accurate. Examples occur in the
data considered by Ref. 7, notably that taken from Ref. 8 and
particularly that from Ref. 9, which will be discussed in the
next section.

D. EXPERIMENTAL DATA AND EXISTING STATISTICAL MODELS

The  amount  of  data  collected  through  IR  measurement  over 
the  years  is  voluminous.  Measurement  programs  for  this  purpose 
have  covered  a  variety  of  targets  and  clutter  backgrounds  in 
virtually all spectral bands of practical interest. Reference 7
contains  an  in-depth  survey  of  the  most  important  experimental 
results  derived  from  such  programs  and  also  provides  a  detailed 
analysis  of  how  the  data  may  be  affected  by  environmental  factors. 

Unfortunately, of the many sources available in the literature, only Ref. 9 offers data processed in a form that is directly
applicable  to  the  mathematical  models  used  in  this  paper.  What 
is  needed  particularly  are  means  and  covariance  matrices,  the 
elements  of  which  depend  upon  the  standard  deviation  for  each 
channel  and  the  correlation  coefficients  between  all  pairs  of 
channels.  It  is  unfortunate  that  data  for  cloud  backgrounds 
have  not  been  published  in  a  similar  form. 
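The dependence of the covariance matrix on the per-channel standard deviations and the pairwise correlation coefficients is the usual one, M_ij = rho_ij sigma_i sigma_j. A minimal sketch follows; the function name and the numerical values are hypothetical and are not data from Ref. 9.

```python
def covariance_from_correlation(std, corr):
    """Build the covariance matrix with elements corr[i][j] * std[i] * std[j]."""
    n = len(std)
    return [[corr[i][j] * std[i] * std[j] for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    # Hypothetical standard deviations and correlation coefficient for two channels.
    std = [2.0, 3.0]
    corr = [[1.0, 0.5],
            [0.5, 1.0]]
    print(covariance_from_correlation(std, corr))
```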

Reference 9 provides all of these parameters for several spectral channels* generated by a number of different terrain backgrounds, each observed during four time periods: predawn, noon, sunset, and midnight. The observations were made from an airborne platform at 90 deg and 35 deg depression angles with instantaneous fields of view (IFOV) ranging from 2 to 5 mrad at altitudes from 1,000 ft to 1,750 ft. However, only terrain backgrounds were measured; no examples of sky, clouds or ocean are included in the collection.

*Of course, the measurements were made in a particular set of fixed wavelength bands. However, Ref. 9 recommends a method of interpolating the measured data to derive equivalent approximate data for other choices of spectral decomposition.


Aside from the first and second moment statistical parameters, for each case Ref. 9 also presents the data in several other forms. These include: (1) a histogram for each spectral band, along with an overlay of the Gaussian probability density curve defined by the mean and standard deviation associated with the histogram, (2) area diagrams showing the size and orientation of all subregions with radiance above a 2σ and above a 3σ threshold, (3) both the cross-track and in-track power spectral densities (sometimes called the Wiener spectra) for the measured region. Figures 1-6, taken from Ref. 9, are examples of all three graphic forms of data.

In many of the cases presented in Ref. 9 the Gaussian density curve fits the corresponding histogram with remarkable accuracy out to the 2σ, 3σ, and sometimes even the 4σ level. This is especially true for midnight scenes that are natural in origin, such as a conifer forest or a desert, as distinguished from land or cities. Figures 1 and 2 show that the fit is fairly good for a conifer forest even at noon.
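One simple way to quantify how well a histogram follows the Gaussian curve in its tail is to compare the observed fraction of samples above the kσ level with the fraction a Gaussian population would give. The sketch below is an illustration only; the function names are hypothetical, and the sample data in the usage lines are synthetic rather than measurements from Ref. 9.

```python
import math
import random

def gaussian_exceedance(k):
    """Fraction of a Gaussian population lying more than k sigma above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))

def empirical_exceedance(samples, k):
    """Fraction of samples above (sample mean + k * sample standard deviation)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    threshold = mean + k * math.sqrt(var)
    return sum(1 for s in samples if s > threshold) / n

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic radiance samples standing in for one spectral band.
    radiance = [rng.gauss(100.0, 5.0) for _ in range(20000)]
    for k in (2.0, 3.0, 4.0):
        print(k, empirical_exceedance(radiance, k), gaussian_exceedance(k))
```

A multi-modal or skewed scene of the kind discussed next would show empirical fractions well above the Gaussian values at the 3σ and 4σ levels.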

Other histograms are multi-modal and skewed. However, for many of these, in the accompanying area diagrams that display the thresholded subregions of maximum radiance, the high-temperature zones appear to be relatively isolated and confined to one or two small areas in the overall background.* When this is the case it seems likely that the lesser modes appearing in the histogram tail would not be present if the scene were broken up into smaller regions and a separate histogram of the radiance distribution were constructed for each of the newly formed regions.

For other cases, e.g., the city of Baltimore, Maryland and Fort A.P. Hill, Virginia, to name the most extreme examples, the multi-modal character of the histogram is evidently not the result of isolated effects in the background. Instead, the high-temperature zones are distributed throughout the scene, and therefore it must be concluded that a Gaussian distribution will not adequately represent these data.

*Cf. Figs. 3 and 4.

FIGURE 4. Equivalent elliptical areas for
Michigan winter scene - noon


Reference 2 details the construction of a set of quasi-synthetic statistical models for the spectral distribution of radiance due to a variety of potential targets and clutter sources. These models were developed by means of analysis based on physical principles and what appear to be reasonable assumptions combined with empirical data gathered from a number of different references, including Ref. 9.

An important application of this work is embodied in a computer program called PALANTIR, which Ref. 2 also describes in some detail. From a given set of narrow band spectral channels PALANTIR chooses a prescribed number of channels, picking those that will provide the least error when used in connection with a minimum error algorithm for discriminating between targets and clutter. The basis for this choice is a test which depends upon the means and covariance matrices associated with the statistical models.

In an attempt to construct a theoretical model for spatial channels, Ref. 10 postulates statistical homogeneity for terrain backgrounds, citing as evidence for this assumption IR measurements taken by the Lincoln Laboratory at 20 natural settings in New England. Statistical homogeneity in this case means that, for the spatial radiance distribution, the cross-correlation between any two pixels depends only on their separation and not on the position of either in the scene.

A further assumption of Ref. 10, for which the same evidence is cited, is that the cross-correlation is an exponential function of the separation. That is, for radius vectors r and r' that determine two points in the plane of the radiance distribution it is assumed that the cross-correlation K(r, r') between the radiance values at the two points has the form

    K_{x,y}(r, r') = \exp\left( -\frac{|x - x'|}{L_x} - \frac{|y - y'|}{L_y} \right),    (4)

where L_x and L_y are correlation distances in the x and y directions and (x, y) and (x', y') are the respective components of r and r'.

The model recommended by Ref. 10 for spatial correlation is general enough to provide for anisotropic behavior; however, its functional form obviously depends upon the choice of the coordinate system. If the spatial distribution were also assumed to be isotropic the correlation function would be independent of the coordinate system. If it were also exponential it would have the form

    K(r, r') = \exp\left( -\frac{|r - r'|}{L} \right),    (5)

which is completely determined by a single correlation distance L.
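The difference between the two correlation models can be seen by evaluating both at separations of equal length but different direction. The sketch below assumes the separable exponential form for (4) and the radially symmetric exponential form for (5); the function names and the correlation distances are hypothetical.

```python
import math

def corr_separable(r, rp, lx, ly):
    """Exponential model of the form (4): separable in x and y."""
    return math.exp(-abs(r[0] - rp[0]) / lx - abs(r[1] - rp[1]) / ly)

def corr_isotropic(r, rp, l):
    """Exponential model of the form (5): depends only on the distance |r - r'|."""
    return math.exp(-math.hypot(r[0] - rp[0], r[1] - rp[1]) / l)

if __name__ == "__main__":
    origin = (0.0, 0.0)
    d = 1.0 / math.sqrt(2.0)
    along_axis = (1.0, 0.0)   # unit separation along x
    diagonal = (d, d)         # unit separation along the diagonal
    # Equal correlation distances (hypothetical value 1.0) in both models.
    print(corr_separable(origin, along_axis, 1.0, 1.0),
          corr_separable(origin, diagonal, 1.0, 1.0))
    print(corr_isotropic(origin, along_axis, 1.0),
          corr_isotropic(origin, diagonal, 1.0))
```

With equal correlation distances the isotropic model returns the same value for both separations, while the separable model returns a smaller value along the diagonal, so it remains direction dependent even when L_x = L_y.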

It is interesting to note that Ref. 10 assumes the anisotropic form (4) for the cross-correlation because the cited supporting data were measured at a depression angle of 20 deg. The argument is that one might expect a scale change from in-track to cross-track linear distance measurements relative to a statistically homogeneous two-dimensional distribution because of the distortion created in the cross-track direction by the depression angle.

However, if the appropriate form of the cross-correlation to account for this distortion were indeed (4) as assumed, then for a 90-deg depression angle the cross-correlation would be given by

    K(r, r') = \exp\left[ -\frac{|x - x'| + |y - y'|}{L} \right],

which is still anisotropic despite the single correlation distance parameter L.

The two-dimensional Fourier transform of either correlation function, given by (4) or (5), is the corresponding power spectral density or Wiener spectrum, W_{x,y}(k) or W(k), in terms of a vector wave number k. The two densities are given by*

    W_{x,y}(k) = \frac{4 L_x L_y}{(1 + k_x^2 L_x^2)(1 + k_y^2 L_y^2)},    (6)

    W(k) = \frac{2\pi L^2}{(1 + k^2 L^2)^{3/2}},    (7)

where k_x and k_y are the Cartesian components and k is the magnitude of the wave number vector.
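The separable spectrum is the product of two one-dimensional transforms of exp(-|x|/L), each equal to 2L/(1 + k^2 L^2), which is where the factor 4 L_x L_y in (6) comes from. The sketch below evaluates the two spectra in the forms 4 L_x L_y / [(1 + k_x^2 L_x^2)(1 + k_y^2 L_y^2)] and 2 pi L^2 / (1 + k^2 L^2)^(3/2), and checks the one-dimensional factor by direct numerical integration. The function names, the integration limits, and the parameter values are assumptions chosen for illustration.

```python
import math

def wiener_separable(kx, ky, lx, ly):
    """Wiener spectrum of the separable exponential correlation, form (6)."""
    return 4.0 * lx * ly / ((1.0 + (kx * lx) ** 2) * (1.0 + (ky * ly) ** 2))

def wiener_isotropic(k, l):
    """Wiener spectrum of the isotropic exponential correlation, form (7)."""
    return 2.0 * math.pi * l ** 2 / (1.0 + (k * l) ** 2) ** 1.5

def cosine_transform_1d(k, l, x_max=60.0, steps=60000):
    """Trapezoidal estimate of the integral of exp(-|x|/l) cos(k x) over the whole line."""
    h = x_max / steps
    total = 0.5 * (1.0 + math.exp(-x_max / l) * math.cos(k * x_max))
    for i in range(1, steps):
        x = i * h
        total += math.exp(-x / l) * math.cos(k * x)
    return 2.0 * h * total  # factor 2 because the integrand is even in x

if __name__ == "__main__":
    k, l = 0.7, 2.0  # hypothetical wave number and correlation distance
    print(cosine_transform_1d(k, l), 2.0 * l / (1.0 + (k * l) ** 2))
    print(wiener_separable(0.7, 0.3, 2.0, 1.5), wiener_isotropic(0.7, 2.0))
```

Comparing the fall-off of these two functions with measured cross-track and in-track spectra is the kind of check discussed in the paragraphs that follow.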

In principle, (6) or (7) might be used to check whether either of the corresponding correlation functions is a good model for a given background when the data obtained from measurements of the background include linear components of the Wiener spectra in at least two different directions. In fact, Ref. 9 does provide data in this form for every case considered and for correlations in both cross-track and in-track directions relative to the scanning motion of the sensor. However, there is no reason to believe that the track direction coincides with either the x or y direction, both of which may be at least partially determined by the physical properties of the background distribution rather than by the motion of the sensor.

Nevertheless, a comparison of the cross-track power spectrum curve with the in-track curve affords at least a preliminary check on the possibility that the radiance distribution is isotropic, i.e., by observing whether the curves are nearly the same. Examples of distributions that may be isotropic do exist in the Ref. 9 data, e.g., for a conifer forest background observed at a 90-deg depression angle at noon.

*See Appendix B.

Figures 5 and 6, taken from Ref. 9, contain Wiener spectra for this case. Curves for the 3.5-3.9 μ and 4.5-5.5 μ bands roughly approximating the in-track spectra depicted in Fig. 6 are shown as dashed lines in Fig. 5 to illustrate the point.

However, an examination of the conifer forest power spectrum curves presented in Ref. 9 fails to disclose any that might correspond to the functional behavior indicated by (7). In every case the spectral density either decreases too rapidly or too slowly with increasing wave number.

It is possible that by changing exponents in the denominator of (7), e.g., replacing the exponent 3/2 with 2 or with [value illegible in source], a better fit to the experimental Wiener spectra might be obtained; Appendix B shows how to calculate the corresponding correlation functions explicitly. Some numerical experimenting with new exponents indicates for the conifer forest data, however, that although changing exponents in (7) can improve the fit somewhat, at best it can only be made close at two points on a given curve.

E.  SIMPLE  MODELS  FOR  TARGET  STATISTICS 

In  principle,  it  is  possible  for  a  sensor  to  estimate  the 
mean  and  the  covariance  matrix  elements  for  clutter  statistics 
by  making  sample  measurements  of  the  background  before  a  target 
arrives and updating these estimates periodically. But it is
even  conceptually  difficult  to  imagine  how  this  information 
might  be  obtained  for  targets  in  general.  The  possibility  of 
using  a  predetermined  catalogue  of  signatures  for  this  purpose 
seems  limited  because  of  the  many  variations  in  range,  aspect, 
altitude,  velocity,  and  position  of  a  target  relative  to  the  sun. 

Fortunately,  in  the  case  of  point  targets  it  is  usually 
reasonable  to  assume  that  covariance  matrix  elements  will  be 
dominated  by  background  statistics.  To  the  extent  that  this 
is  true,  for  discrimination  purposes  it  is  only  necessary  to 
anticipate  the  mean  values  associated  with  the  data  channels 
that  define  a  target's  signature. 


FIGURE  5.  Power  spectra  -  Michigan  winter  scene 
Noon  -  (Angle:  90  deg.)  -  Crosstrack 


FIGURE 6. Power spectra - Michigan winter scene
Noon - (Angle: 90 deg.) - Intrack


As  remarked  in  Ref.  7,  for  spectral  channels  the  observed 
radiance  is  essentially  the  sum  of  two  parts:  the  first  is  due 
to  the  background  except  for  the  area  occulted  by  the  target; 
the  second  is  the  difference  between  the  radiance  due  to  the 
target  and  the  background  radiance  that  would  result  from  the 
occulted  area  if  it  were  not  obscured. 

Included  in  the  radiance  there  should  be  a  part  due  to 
atmospheric  emissions  along  the  propagation  path.  However,  it 
will  be  assumed  here  that  this  contributes  a  negligible  amount 
to  statistical  fluctuations  about  the  mean. 

In the case of a point target, which, by definition, occupies only a small part of the sensor's footprint, the mean radiance observed is that obtained from a calculation of the type suggested in Ref. 7. The calculation is equivalent to a weighted average \bar{J}_T, given by

    \bar{J}_T = W_C \bar{J}_C + W_T \hat{J}_T,    (8)

where \hat{J}_T is the radiance supplied by the target, \bar{J}_C is the mean clutter radiance, and the two coefficients W_T and W_C are fractions of the total footprint area within and without the occulted area.

According to a well-known theorem,* if the components of an N-dimensional vector J have an N-variate Gaussian joint probability distribution with the mean vector \bar{J} and the covariance matrix M, then the components of the M-dimensional vector Y resulting from the linear transformation obtained when J is multiplied on the left by an M by N matrix T (i.e., Y = T J) will have an M-variate Gaussian joint probability distribution with the mean vector \bar{Y} given by

    \bar{Y} = T \bar{J}

and the covariance matrix

    T M T^t.    (9)


Given the present assumptions, the background radiance J_C and the target radiance \hat{J}_T may be regarded as having a joint bivariate Gaussian probability distribution for which the mean vector has the components \bar{J}_C and the mean of \hat{J}_T, and the covariance matrix has the form

    \begin{pmatrix} \sigma_C^2 & 0 \\ 0 & 0 \end{pmatrix}.    (10)

In (10) it is, of course, implicit that the target and background radiances \hat{J}_T and J_C are uncorrelated. Also, (10) represents a limiting case in which the standard deviation of \hat{J}_T is vanishingly small, so that there are no fluctuations of \hat{J}_T about its mean.

In accordance with (8) and the first equation in (9) the vector with the components W_C and W_T corresponds to a 1 by 2 transformation matrix T. Then, according to the cited theorem and (10), the combined target and background radiance will have a univariate Gaussian probability distribution with a mean given by (8) and a variance \sigma_T^2 given by

    \sigma_T^2 = W_C^2 \sigma_C^2.    (11)
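The step from the theorem to the variance (11) can be checked numerically by applying the 1 by 2 transformation (W_C, W_T) to a bivariate distribution whose covariance matrix has sigma_C^2 in the clutter position and zeros elsewhere. The sketch below is a minimal illustration; the helper names, the footprint fractions, and the radiance values are hypothetical.

```python
def mat_mul(a, b):
    """Ordinary matrix product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def transform_gaussian(t, mean, cov):
    """Mean T*mean and covariance T*cov*T^t of Y = T J for Gaussian J."""
    new_mean = [sum(t_ik * m_k for t_ik, m_k in zip(row, mean)) for row in t]
    new_cov = mat_mul(mat_mul(t, cov), transpose(t))
    return new_mean, new_cov

# Hypothetical values: 90 percent of the footprint is clutter, 10 percent target.
w_c, w_t = 0.9, 0.1
mean_clutter, target_radiance, sigma_c = 5.0, 50.0, 2.0
t = [[w_c, w_t]]
bivariate_cov = [[sigma_c ** 2, 0.0],
                 [0.0, 0.0]]  # target radiance does not fluctuate
combined_mean, combined_cov = transform_gaussian(
    t, [mean_clutter, target_radiance], bivariate_cov)

if __name__ == "__main__":
    print(combined_mean[0])    # weighted average of clutter mean and target radiance
    print(combined_cov[0][0])  # w_c**2 * sigma_c**2
```

The same `transform_gaussian` helper applies unchanged when the mean and covariance refer to several spectral channels and T has more than one row.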

In the case of multiple spectral channels \bar{J}_C, \hat{J}_T and, inferentially, \bar{J}_T would all be replaced by vectors in (8). Then (11) would be unchanged in form except that each variance would be replaced by a matrix; \sigma_C^2, in fact, would be replaced by the background covariance matrix associated with the multiple channels. That is, for N spectral channels

    \bar{J}_T = W_C \bar{J}_C + W_T \hat{J}_T,
    M_T = W_C^2 M_C,    (12)

where \bar{J}_T, \bar{J}_C and \hat{J}_T are N component vectors, M_C is the N by N background covariance matrix, and M_T is an N by N matrix that may be regarded as the effective target covariance matrix. According to (12) the target and background covariance matrices are proportional.

A similar model can be devised for spatial channels that form an N by N pixel array. Using C to denote a clutter source, T to denote a target source, and the case N = 3 for illustration, when the target is absent the pixel array will have the form

    C, C, C
    C, C, C
    C, C, C

and when the target is present the form

    C, C, C
    C, T, C
    C, C, C

The  case  of  a  target  source  in  the  array  but  not  at  the 
center  will  be  regarded  as  a  case  in  which  the  target  is  absent. 
It  will  be  assumed  that  the  probability  that  a  target  will  be 
anywhere within the array at any given time is small. Thus, the
cases  for  which  it  is  present  but  not  at  the  center  may  be 
neglected  as  consisting  of  an  insignificant  number  of  events  in 
comparison  with  the  number  of  events  for  which  it  is  absent 
altogether.  That  is,  such  events  will  have  a  negligible  effect 
on  the  clutter  probability  distribution. 

It will also be assumed that the target source is uncorrelated with any background clutter source. Consider, for example, the ideal case in which the target exactly occupies the central pixel so that the clutter is completely occulted. When this happens the covariance matrix associated with presence of a target must be such that the cross-correlation between the central pixel and every other pixel in the array is zero.

Then

    M_T = M_C - \Delta M,    (13)

where M_C has the elements M_{ij} and \Delta M has the elements

    \Delta M_{ij} = M_{0j} \delta_{i0} + M_{i0} \delta_{j0} - M_{00} \delta_{i0} \delta_{j0} - \sigma_T^2 \delta_{i0} \delta_{j0}.    (14)

In (14) the notation described earlier, according to which the subscripts are all two-component vectors, is to be assumed. The quantities \delta_{ij} are Kronecker deltas, which vanish except when the vectors i and j are identical.

It is easily verified that (13) and (14) define a target covariance matrix with the appropriate properties. That is, when either i or j, but not both, is the zero vector the corresponding pixel is being correlated with the center of the array, and, as it should, the corresponding matrix element of M_T vanishes. When neither i nor j is the zero vector neither of the pixels is at the array's center, and, as it should be, the corresponding element of M_T is identical with that of M_C. Finally, when both i and j are the zero vector the pixel is at the array's center, and the corresponding element of M_T is \sigma_T^2 which, as it should be, is the variance of the target source.
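The three properties just listed can be verified mechanically for a 3 by 3 array. The sketch below assumes the reading M_T = M_C - dM with dM_ij = M_0j d_i0 + M_i0 d_j0 - M_00 d_i0 d_j0 - sigma_T^2 d_i0 d_j0 for (13) and (14), where d denotes the Kronecker delta and 0 the center pixel. The clutter covariance is generated from the separable exponential correlation model purely for illustration; the function names and parameter values are hypothetical.

```python
import math

def pixel_offsets(n):
    """Two-component pixel indices of an n by n array, with (0, 0) at the center."""
    half = (n - 1) // 2
    return [(x, y) for y in range(-half, half + 1) for x in range(-half, half + 1)]

def clutter_covariance(n, sigma_c, lx, ly):
    """Hypothetical clutter covariance built from the separable exponential model."""
    pix = pixel_offsets(n)
    return {(i, j): sigma_c ** 2
            * math.exp(-abs(i[0] - j[0]) / lx - abs(i[1] - j[1]) / ly)
            for i in pix for j in pix}

def target_covariance(m_c, sigma_t):
    """Target-present covariance M_T = M_C - dM, with dM as assumed in the lead-in."""
    o = (0, 0)
    m_t = {}
    for (i, j), m_ij in m_c.items():
        di = 1.0 if i == o else 0.0
        dj = 1.0 if j == o else 0.0
        dm = (m_c[(o, j)] * di + m_c[(i, o)] * dj
              - m_c[(o, o)] * di * dj - sigma_t ** 2 * di * dj)
        m_t[(i, j)] = m_ij - dm
    return m_t

if __name__ == "__main__":
    m_c = clutter_covariance(3, sigma_c=2.0, lx=1.5, ly=1.5)
    m_t = target_covariance(m_c, sigma_t=0.5)
    print(m_t[((0, 0), (0, 0))])   # variance of the target source
    print(m_t[((0, 0), (1, 0))])   # center pixel against a neighbor
    print(m_t[((1, 0), (1, 1))], m_c[((1, 0), (1, 1))])  # unchanged clutter element
```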

For the case in which the target source occupies only part of the central pixel the terms in (14) will be weighted as in (12). Analysis similar to that used to derive the weight factors for spectral channels will provide the appropriate weight factors for spatial channels.*

One  other  concern  in  modeling  the  signal  produced  by  an 
IR  sensor  should  be  mentioned.  Most  systems  provide  contrast 
rather  than  absolute  measurements  of  the  radiance  distribution 
in  a  scene. 

Since the contrast is approximately the difference between the radiance values observed at two successive pixels along a scanline in the scene, its measurement is equivalent to applying a linear filter (high pass) to the spatial channels. The effect of linear spatial filtering in general is discussed in Section B of Chapter III.


*If the region occupied by the target source is larger than the central pixel, [symbol illegible in source] should contain additional terms with factors of the form M_{i1}, and weights given by the components of a vector associated with the rows and columns twice removed from the central pixel.

III. CFAR TARGET DETECTION ALGORITHMS


A.  OPTIMUM  SEGMENTATION 

The term "segmentation" is used in image processing literature to denote the process of separating different classes of objects in a scene. Generally, this is done with the purpose of minimizing the probability that there will be an error in the classification. However, for the applications of interest to this paper, in separating targets from clutter it is more important to set a bound on the probability of false alarm. This is equivalent to prescribing a constant false alarm rate (CFAR), which is a goal common to many IR systems.

Given  the  CFAR  condition,  the  problem  of  optimizing  the 
segmentation  may  be  restated  as  follows.  Among  all  possible 
rules for detecting the presence of a target with a given false
alarm  probability,  find  the  rule  for  which  the  probability  of 
the  detection  is  a  maximum. 

Appendix A derives the general solution of this problem in terms of the joint probability density P_C(J) for the distribution of radiance values over the available data channels in the absence of a target and the corresponding joint probability density P_T(J) when the target is present. As in Chapter II, J is an N-dimensional vector each of whose components is the radiance value in one of the channels.

The solution is to declare that a target is present when the measured components of J are such that J defines a point in a certain N-dimensional region R. The boundary for this region is a hypersurface that is determined by the equation

    \log P_T(J) - \log P_C(J) = constant.    (8)

The constant in (8) is determined by the condition

    \int_R P_C(J) \, dJ_1 \ldots dJ_N = \phi,    (9)

which is equivalent to the CFAR requirement that the probability of a false alarm be equal to \phi.

For the case of N-variate Gaussian probability densities P_T(J) and P_C(J) with mean vectors \bar{J}_T and \bar{J}_C and covariance matrices M_T and M_C, respectively, (8) and (9) reduce to

    Q(J) \equiv (J - \bar{J}_T)^t M_T^{-1} (J - \bar{J}_T) - (J - \bar{J}_C)^t M_C^{-1} (J - \bar{J}_C) = \gamma,    (10)

where \gamma is a positive constant determined by

    \frac{1}{\sqrt{(2\pi)^N |M_C|}} \int_{R(\gamma)} \exp\left[ -\tfrac{1}{2} (J - \bar{J}_C)^t M_C^{-1} (J - \bar{J}_C) \right] dJ_1 \ldots dJ_N = \phi.    (11)

In (10) Q(J) is obviously a quadratic function of the N components of J, so that the hypersurface defined by (10) is a quadric surface (e.g., a conic section in the case N=2). In (11) the region of integration R(\gamma) consists of all points J for which

    Q(J) < \gamma.    (12)
The reason why R(\gamma) is defined by (12) instead of by

    Q(J) > \gamma

is that J = \bar{J}_T satisfies (12) (this follows from the fact that M_C, and therefore M_C^{-1}, must be positive definite); therefore, the mean vector for the probability distribution when a target is present corresponds to a point in the region defined by (12). For a practical case, in which the probability of a target detection, given by

    P_{TD} = \frac{1}{\sqrt{(2\pi)^N |M_T|}} \int_{R(\gamma)} \exp\left[ -\tfrac{1}{2} (J - \bar{J}_T)^t M_T^{-1} (J - \bar{J}_T) \right] dJ_1 \ldots dJ_N,    (13)

is large enough to be of any use, the mean vector would have to represent a point in the region R(\gamma) of integration in (13).

The use of (12), subject to the condition (11), as a test to determine whether the presence of a target should be declared is a somewhat less formidable problem numerically when the covariance matrices M_C and M_T are both diagonal. This will be true only if all N data channels are mutually independent in the statistical sense.
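For diagonal covariance matrices the quadratic test can be sketched in a few lines. The code below assumes Q(J) is the target quadratic form minus the clutter quadratic form, each reduced to a sum over channels, and that a target is declared when Q(J) < gamma. Instead of evaluating the false alarm integral analytically, the threshold is estimated by Monte Carlo sampling of the clutter distribution, which is a substitute technique chosen for brevity and not the paper's method. All names and numerical values are hypothetical.

```python
import random

def q_diagonal(j, mean_t, var_t, mean_c, var_c):
    """Q(J) for diagonal covariances: target form minus clutter form, summed over channels."""
    return sum((x - mt) ** 2 / vt - (x - mc) ** 2 / vc
               for x, mt, vt, mc, vc in zip(j, mean_t, var_t, mean_c, var_c))

def threshold_for_false_alarm(phi, mean_t, var_t, mean_c, var_c, trials=20000, seed=1):
    """Monte Carlo estimate of gamma such that P(Q < gamma | clutter only) is about phi."""
    rng = random.Random(seed)
    q = sorted(
        q_diagonal([rng.gauss(mc, vc ** 0.5) for mc, vc in zip(mean_c, var_c)],
                   mean_t, var_t, mean_c, var_c)
        for _ in range(trials))
    return q[int(phi * trials)]

def target_declared(j, gamma, mean_t, var_t, mean_c, var_c):
    """Apply the test Q(J) < gamma to one measurement vector."""
    return q_diagonal(j, mean_t, var_t, mean_c, var_c) < gamma

if __name__ == "__main__":
    # Hypothetical two-channel means and variances.
    mean_t, var_t = [3.0, 1.0], [1.0, 1.0]
    mean_c, var_c = [0.0, 0.0], [1.0, 2.0]
    gamma = threshold_for_false_alarm(0.01, mean_t, var_t, mean_c, var_c)
    print(target_declared([3.2, 0.8], gamma, mean_t, var_t, mean_c, var_c))
```

The Monte Carlo threshold is only as accurate as the number of trials allows; for very small false alarm probabilities a much larger sample, or the analytic integral, would be needed.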

If the covariance matrices are not diagonal there exists a linear transformation of the vector J to a vector J' such that the probability densities P_C(J') and P_T(J') will both have covariance matrices that are diagonal. This follows from the theorem used to derive the expressions (9) in Chapter II and the well-known fact that there is always a linear transformation that can diagonalize any two symmetric matrices simultaneously as long as one of them is positive definite.* Appendix C describes the process of finding the required transformation and carries it out in detail for the point target case in connection with spatial channels.


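When M_T and M_C are both diagonal, with diagonal elements \sigma_{Tn}^2 and \sigma_{Cn}^2 respectively, (10), which represents a hypersurface, has the form

    \sum_{n=1}^{N} \left[ \frac{(J_n - \bar{J}_{Tn})^2}{\sigma_{Tn}^2} - \frac{(J_n - \bar{J}_{Cn})^2}{\sigma_{Cn}^2} \right] = \gamma.    (14)

By setting all of the J_n in (14) equal to zero except for two values of n, \nu and \mu, it is possible to obtain the two-dimensional cross-section of the region bounded by the hypersurface (which is, itself, an N-1 dimensional manifold) in the \nu, \mu plane. This cross-section will be the region bounded by the curve

    K_\nu J_\nu^2 + K_\mu J_\mu^2 - 2 C_\nu J_\nu - 2 C_\mu J_\mu = K,    (15)

where the K_i and C_i, i = \nu, \mu, in (15) are defined by

    K_i = \frac{1}{\sigma_{Ti}^2} - \frac{1}{\sigma_{Ci}^2},    C_i = \frac{\bar{J}_{Ti}}{\sigma_{Ti}^2} - \frac{\bar{J}_{Ci}}{\sigma_{Ci}^2},    i = \nu, \mu,

and

    K = \gamma - \frac{\bar{J}_{T\nu}^2}{\sigma_{T\nu}^2} + \frac{\bar{J}_{C\nu}^2}{\sigma_{C\nu}^2} - \frac{\bar{J}_{T\mu}^2}{\sigma_{T\mu}^2} + \frac{\bar{J}_{C\mu}^2}{\sigma_{C\mu}^2}

(for N > 2 the constant terms contributed by the remaining channels are also to be included in K).

The cross-section determined by (15) is then clearly an ellipse when

    \sigma_{T\nu} > \sigma_{C\nu} and \sigma_{T\mu} > \sigma_{C\mu}

(or when both inequalities are reversed), a parabola when either

    \sigma_{T\nu} = \sigma_{C\nu} or \sigma_{T\mu} = \sigma_{C\mu}

but not both, a hyperbola if either

    \sigma_{T\nu} > \sigma_{C\nu} and \sigma_{T\mu} < \sigma_{C\mu}    or    \sigma_{T\nu} < \sigma_{C\nu} and \sigma_{T\mu} > \sigma_{C\mu},

and a straight line if

    \sigma_{T\nu} = \sigma_{C\nu} and \sigma_{T\mu} = \sigma_{C\mu}.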

A case of particular interest is that for which the covariance matrices M_T and M_C are identical. When this is true (10) becomes

    -2 \Delta\bar{J}^t M^{-1} J = \gamma + \bar{J}_C^t M^{-1} \bar{J}_C - \bar{J}_T^t M^{-1} \bar{J}_T,    (16)

where

    \Delta\bar{J} = \bar{J}_T - \bar{J}_C

and M is the common covariance matrix. Equation (16) has the form

    W^t J = \sum_{n=1}^{N} W_n J_n = constant,    (17)

where W is the vector given by

    W = -2 M^{-1} \Delta\bar{J}.    (18)
The constant on the right side of (17) is to be determined by the CFAR condition (11). This suggests that a change of variables such that one of the new variables, the scalar J, is given by

    J = \sum_{n=1}^{N} W_n J_n    (19)

might be useful. Then the P_FA given by (11) and the P_TD given by (13) will depend only on the respective C and T marginal probability distributions for J.

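According to the theorem, cited in Chapter II, concerning the effect of a linear transformation on a multivariate Gaussian probability distribution, the two probability distributions for J are univariate Gaussian with means \bar{J}_i, i = T, C, given by

    \bar{J}_i = W^t \bar{J}_i = \sum_{n=1}^{N} W_n \bar{J}_{in},    i = T, C    (20)

(in which \bar{J}_i on the left is the scalar mean of the new variable and \bar{J}_i in the middle expression is the original mean vector), and a common variance \sigma^2 given by

    \sigma^2 = W^t M W.    (21)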


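The optimum target discrimination result for the univariate case derived in Appendix A now applies. That is, with \nu defined by the CFAR condition

    \frac{1}{\sqrt{2\pi}} \int_{\nu}^{\infty} \exp\left( -\tfrac{1}{2} x^2 \right) dx = \phi,    (22)

if \Delta\bar{J} > 0 (the difference between the scalar means \bar{J}_T and \bar{J}_C), then a target is declared to be present when

    J > \bar{J}_C + \sigma\nu;    (23)

if \Delta\bar{J} < 0, then a target is declared to be present when

    J < \bar{J}_C - \sigma\nu.    (24)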

B. LINEAR FILTERS

As observed in Section A, the optimum CFAR rule for target detection is non-linear, in fact quadratic, unless the covariance matrices M_T and M_C are identical. However, most detection algorithms for spatial, or for that matter temporal, channels are based on thresholding after the application of a linear filter.

The simplest example is, perhaps, the temporal filter that is sometimes referred to as Moving Target Identification (MTI), for which the basic idea is to detect a target's motion relative to what is presumed to be a stationary background. It is usually proposed for a staring system.

In this connection a single temporal channel is a frame that consists of the radiance distribution over the entire scene at a given instant of time. A sequence of such frames constitutes a set of temporal channels, just as an array of pixels constitutes a set of spatial channels.

The first-order MTI filtering process, the first difference, consists of subtracting one of two successive frames from the other. If a moving target is present but the background is fixed, this difference will be zero everywhere except at the two target positions, one in each frame.

The major problem encountered by MTI is the difficulty of maintaining registration for the background from one frame to the next. Any motion of the sensor will cause an apparent movement in the background.

From one point of view this is a problem of correcting platform instability. However, for motion that is slowly varying or smooth (as distinguished, for example, from jitter) more complex temporal filtering may reduce or eliminate the error. Common filters for this purpose are second- or higher-order differences.

All such filters are linear and, in fact, are special cases of a sliding window weighted average, also known as a convolution. The general sliding window temporal filter is a linear transformation of the form

    J'_\nu = \sum_{n=-(N-1)/2}^{(N-1)/2} W_n J_{n+\nu}    (25)

from a sequence of radiance values J_\nu at an arbitrary pixel common to each frame to the sequence J'_\nu. After the transformation is applied each term of the new sequence is usually thresholded and averaged, or averaged and thresholded, to form a simple spatial distribution which can then be processed further, as a spatial scene, to detect and locate possible targets.

Sliding window spatial filters that operate on a two-dimensional array of pixels rather than a one-dimensional sequence of frames define analogous convolution transformations:

    J'_\nu = \sum_{n} W_n J_{n+\nu}.    (26)

In (26) the subscripts are two-component vectors in accordance with the notation introduced in Chapter II. The sum is taken over all pixels in an N by N array, for which N is an odd integer and the two-component vector \nu locates the pixel at the center of the array.

In discussing either (25) or (26) it is convenient to set \nu = 0, which is equivalent to choosing a particular coordinate
system  for  the  discussion.  A  simple  way  to  represent  particular 
examples  of  (25)  or  (26),  one  that  has  become  conventional,  at 
least  for  the  two-dimensional  case,  is  to  use  a  mask  consisting 
of  the  weights  Wn  ordered  as  in  the  sequence  or  as  in  the  array. 

Examples  for  the  one-dimensional  (temporal)  case  are: 

(1)  the  first  difference  mask 

(0,  -1,  1), 

(2)  the  second  difference  mask 

(1,  -2,  1), 

(3)  the  third  difference  mask 

(0,  -1,  3,  -3,  1)  . 

These  masks  apply  to  a  sequence  of  discrete  instants  or  time 
frames,  which  may  be  regarded  as  points  along  a  temporal  coor¬ 
dinate  axis. 
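
The behavior of the three temporal masks can be checked numerically. The following is a minimal sketch, not taken from the paper, that applies (25) directly to two synthetic sequences. The function name `sliding_window` and the test sequences are illustrative assumptions.

```python
def sliding_window(seq, mask):
    """Apply J'_v = sum_n W_n * J_{n+v} at every v where the window fits.

    `mask` holds the weights W_n for n = -(N-1)/2 ... (N-1)/2, N odd.
    """
    if len(mask) % 2 != 1:
        raise ValueError("mask length N must be odd")
    half = (len(mask) - 1) // 2
    out = []
    for v in range(half, len(seq) - half):
        out.append(sum(w * seq[v + n]
                       for n, w in zip(range(-half, half + 1), mask)))
    return out


FIRST_DIFF = (0, -1, 1)
SECOND_DIFF = (1, -2, 1)
THIRD_DIFF = (0, -1, 3, -3, 1)

ramp = [2 * t + 5 for t in range(10)]    # background drifting linearly in time
parabola = [t * t for t in range(10)]    # background drifting quadratically

first_on_ramp = sliding_window(ramp, FIRST_DIFF)
second_on_ramp = sliding_window(ramp, SECOND_DIFF)
second_on_parabola = sliding_window(parabola, SECOND_DIFF)
third_on_parabola = sliding_window(parabola, THIRD_DIFF)
```

In this sketch the first difference leaves a constant residue on the linear ramp, the second difference removes the ramp entirely, and the third difference removes the quadratic drift as well, which is the sense in which higher-order differences suppress smooth apparent motion.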

Examples for the two-dimensional (spatial) case are the
so-called Laplacian filters:*

(1) the digital analogue of the Laplacian differential
operator $\partial^2/\partial x^2 + \partial^2/\partial y^2$ is represented by the mask

$$\begin{pmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{pmatrix} ,$$

(2) a rotationally symmetric modification by the mask

[mask not legible in the source]

(3) the digital analogue of the differential operator
$\partial^2/\partial x\,\partial y$ by the mask

[mask not legible in the source]

These masks apply to a planar array of discrete pixels.

*Ref. 1, p. 482.

The  Laplacian  filters  were  designed  to  detect  edges  in 
a  scene.  The  second  filter,  because  it  is  rotationally  sym¬ 
metric,  is  actually  a  point  detector  and  is  therefore  of  par¬ 
ticular  interest  for  applications  involving  point  targets. 

Starting  with  the  assumption  that  the  spatial  correlation 
function  of  the  background  distribution  has  the  form  (4)  dis¬ 
cussed  in  Chapter  II,  Ref.  10  derives  the  last  filter  as  an 
approximation  for  the  one  that  maximizes  the  signal-to-clutter 
ratio . 

However,  as  observed  in  Chapter  II,  the  correlation  func¬ 
tion  (4)  is  that  of  an  anisotropic  background  oriented  fortui¬ 
tously  to  conform  with  the  track  direction  of  the  sensor  as  it 
scans  the  scene.  If  the  same  derivation  were  applied  after 
assuming  the  isotropic  correlation  function  (5)  instead,  a  com¬ 
pletely  different  type  of  filter,  for  which  the  continuous  ana¬ 
logue  would  be  a  differentio-integral  operator,  would  result. 
Note that in each of the filter examples just presented
the sum of the weights shown in the mask array is zero.
Filters with this property are termed high pass because they
completely eliminate a constant background distribution; i.e., they
eliminate the DC component of the background distribution's
spatial frequency expansion.
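
The zero-sum property can be illustrated with a short sketch. It is not part of the original paper. It assumes the four-neighbor Laplacian mask with a center weight of 4 (the value required for the weights to sum to zero) and a hypothetical 5 x 5 scene. The helper name `apply_mask` is illustrative.

```python
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]   # center weight 4 is assumed so that the sum is zero


def apply_mask(image, mask):
    """Apply J'_v = sum_n W_n * J_{n+v} at every interior pixel v."""
    half = len(mask) // 2
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(half, rows - half):
        out_row = []
        for c in range(half, cols - half):
            total = 0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    total += mask[dr + half][dc + half] * image[r + dr][c + dc]
            out_row.append(total)
        out.append(out_row)
    return out


flat = [[7] * 5 for _ in range(5)]        # constant background
with_target = [row[:] for row in flat]
with_target[2][2] += 10                   # single-pixel point source

flat_out = apply_mask(flat, LAPLACIAN)
target_out = apply_mask(with_target, LAPLACIAN)
```

The constant background is removed completely, while the point source survives as a strong central response surrounded by weaker negative side lobes.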

For $\nu = 0$, (25) and (26) both have the form

$$J' = \sum_n W_n\, J_n , \tag{27}$$

except that the subscript n is a scalar in one case and a
two-component vector in the other. If the mean vectors $\mathbf{J}_C$ and $\mathbf{J}_T$
and the covariance matrices $M_C$ and $M_T$ are associated with
N-variate Gaussian probability distributions for the $J_n$ in the case
when targets are absent and in the case when a target is present,
then according to the theorem of Chapter II the corresponding
variables $J'$ have univariate Gaussian probability distributions
with means and variances given by

$$J'_i = \sum_n W_n\, J_{in} , \qquad \sigma_i^2 = \sum_{n,m} W_n\, M_{inm}\, W_m , \qquad i = C, T. \tag{28}$$

According to Appendix A and Section A of this chapter the
optimum CFAR algorithm is non-linear except when the covariance
matrices $M_C$ and $M_T$ are identical. Therefore, the use of a linear
filter will be less than optimum unless this is, in fact, true.

When the two covariance matrices are identical a comparison
of (20) and (21) with (28) shows that the weights
corresponding to the linear filter that provides CFAR optimization
will be the components of the vector W given by (18). It is
interesting to note that this filter is exactly the same as the
one that Ref. 1 (pp. 580-561) shows will maximize the
signal-to-noise ratio if the signal power is identified with $(J'_T - J'_C)^2$
and the noise power with $\sigma^2$.*
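
The mean and variance formulas in (28) and the signal-to-noise comparison can be sketched numerically. Equation (18) lies outside this section, so the sketch below assumes the standard matched-filter form, weights proportional to $M^{-1}(\mathbf{J}_T - \mathbf{J}_C)$ for a shared covariance $M$. That form, the two-channel numbers, and all function names are assumptions made for illustration only.

```python
def filter_stats(W, J, M):
    """Mean and variance of J' = sum_n W_n J_n, following (28)."""
    size = len(W)
    mean = sum(W[n] * J[n] for n in range(size))
    var = sum(W[n] * M[n][m] * W[m] for n in range(size) for m in range(size))
    return mean, var


def snr(W, J_T, J_C, M):
    """(J'_T - J'_C)^2 / sigma^2 for a covariance M shared by both cases."""
    mean_t, var = filter_stats(W, J_T, M)
    mean_c, _ = filter_stats(W, J_C, M)
    return (mean_t - mean_c) ** 2 / var


def solve_2x2(M, d):
    """Solve M W = d for a 2 x 2 matrix M by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(d[0] * M[1][1] - M[0][1] * d[1]) / det,
            (M[0][0] * d[1] - d[0] * M[1][0]) / det]


# Hypothetical two-channel values, chosen only for illustration.
J_C = [1.0, 1.0]
J_T = [2.0, 1.0]
M = [[2.0, 1.0], [1.0, 2.0]]

# Assumed matched-filter weights: W = M^{-1} (J_T - J_C).
W_matched = solve_2x2(M, [t - c for t, c in zip(J_T, J_C)])
snr_matched = snr(W_matched, J_T, J_C, M)
snr_single_channel = snr([1.0, 0.0], J_T, J_C, M)
```

With these numbers the assumed matched filter gives a ratio of 2/3, compared with 1/2 for a filter that simply reads the first channel, even though the second channel carries no mean difference at all. The gain comes from using the correlated second channel to cancel part of the clutter.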

For the statistical model proposed in Chapter II for
spatial channels, when the target source exactly occupies a single
pixel the covariance matrices $M_C$ and $M_T$ will never be equal,
however. Since the procedure detailed in Appendix C for
diagonalizing $M_C$ and $M_T$ simultaneously in this case is easily
implemented, it may be simpler to use the true optimum CFAR detection
algorithm, which is quadratic, than it would be to obtain what
must necessarily be a sub-optimum linear filter.

Nevertheless,  as  observed  in  the  discussion  in  Chapter  II 
of  the  model  applied  to  spectral  channels,  the  two  covariance 
matrices  are  at  least  proportional.  If  the  proportionality 
constant  is  nearly  equal  to  one,  as  is  usually  the  case  for 
spectral  discrimination,  and  the  magnitudes  of  the  corresponding 
mean  vectors  are  sufficiently  different,  the  linear  filter  whose 
weights  are  given  by  (18)  will  provide  near  optimum  CFAR  dis¬ 
crimination.  This  follows  from  the  fact  that  the  quadratic 
term  in  (A-19)  of  Appendix  A  can  then  be  neglected  in  comparison 
with  the  linear  term. 

C.  TRACKING  ALGORITHMS 

For IR systems that detect point targets the false alarm
rate is the specification that usually dominates the signal
processing requirements. The desired rate may be as low as one
per hour, implying false alarm probabilities as small as $10^{-10}$
whenever a target is declared.

Only  an  algorithm  composed  of  a  number  of  tests  that  are, 
in  effect,  guaranteed  to  be  mutually  independent  has  any  hope 
of  achieving  so  small  a  PFA.  Such  a  guarantee  may  be  possible 
for  an  algorithm  based  on  temporal  discrimination  if  the  inter¬ 
val  between  successive  time  frames  is  sufficiently  large.  That 
is,  the  interval  must  be  larger  than  any  correlation  time  asso¬ 
ciated  with  spatial  or  spectral  discriminants. 

*Cf. also Ref. 10.
For a staring system, MTI differencing, as described in
Section B, will tend to remove whatever correlated false alarms
may result from preliminary target detection algorithms that are
applied to the spatial or spectral channels. Scanning systems,
on the other hand, generate false alarms that are spatially
correlated when they are separated by less than a correlation
distance associated with the background radiance distribution.
One method of eliminating this kind of dependence has been to
treat any group of detections that cluster so closely as a
single detection located at the centroid of the group.

Most  of  the  detections  resulting  from  the  CFAR  algorithms 
will,  of  course,  be  false  alarms.  The  final  decision  that  a 
target  is  present  will  be  referred  to  here  as  a  target  declara¬ 
tion  to  distinguish  it  from  the  CFAR  detections  established 
before  this  decision  process  is  invoked. 

Systems  that  are  required  to  maintain  very  low  false  alarm 
rates  usually  rely  upon  tracking  algorithms  to  provide  target 
declarations.  Those  are  algorithms  that  distinguish  between 
target  and  clutter  sources  by  means  of  the  presumed  trajectory 
characteristics  of  such  sources  when  they  are  observed  in  mo¬ 
tion  over  several  time  frames. 

A tracking algorithm must deal with two types of
trajectories: the non-accidental, which is due to the real motion
of a source relative to the IR sensor, and the accidental, which
is due to a random juxtaposition of clutter sources. Because
the first type occurs in great variety, according to the
scenario, the environment, and the system configuration, the
effectiveness of an algorithm in dealing with it is difficult to
evaluate except on a case-by-case basis.* However, it is
possible to estimate with some generality an algorithm's
effectiveness in dealing with the second type of trajectory.

*Ref. 3 (pp. 310-330) describes a number of tracking algorithms
that have been used for image processing. The list is far from
exhaustive, however; recent IR system designs, for example,
have introduced tracking algorithms based on trajectory
characteristics that do not seem to have been exploited previously.

First of all, to the extent that the preliminary CFAR
algorithms perform their assigned function, it may be assumed that
every false alarm occurs with the same specified CFAR
probability $\phi$. Suppose that, in order to declare a target, the tracking
algorithm requires the formation of some spatial pattern by a
minimum of r detections, one from each of r different time
frames. Suppose also that n is the total number of possible
detection combinations that can form such a pattern. Then the
probability that the algorithm will generate a false target
declaration because of random false detections is given by

$$P = 1 - (1 - \phi^r)^n . \tag{29}$$


Suppose that the system's false alarm rate specification
implies a probability of false target declaration no greater
than $P_0$. Then from (29) it follows that r, which is the
minimum number of detections required to establish a target track,
will be determined by the inequality

$$r \ge \frac{\log\left[1 - (1 - P_0)^{1/n}\right]}{\log \phi} . \tag{30}$$

Since the original reason for invoking the tracking
algorithm was the premise

$$P_0 \ll 1 ,$$

(30) is essentially equivalent to

$$r \ge \frac{\log P_0 - \log n}{\log \phi} . \tag{31}$$

The right side of (31) approximates the right side of (30) with
an error whose absolute value will certainly be less than 0.5;
thus, the two inequalities will lead to the same bound when
rounded off to the nearest integer.
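
The closeness of the two bounds can be checked numerically. The following sketch is not from the paper. The function names and the sample values of $P_0$, n, and $\phi$ are illustrative assumptions. Because $(1 - P_0)^{1/n}$ is indistinguishable from 1 in double precision for realistic values, the sketch evaluates the exact expression with `math.log1p` and `math.expm1` rather than literally.

```python
import math


def false_declaration_probability(phi, r, n):
    """Equation (29), P = 1 - (1 - phi**r)**n, evaluated stably."""
    return -math.expm1(n * math.log1p(-(phi ** r)))


def r_bound_exact(p0, n, phi):
    """Right side of (30)."""
    # 1 - (1 - p0)**(1/n), computed without catastrophic cancellation.
    x = -math.expm1(math.log1p(-p0) / n)
    return math.log(x) / math.log(phi)


def r_bound_approx(p0, n, phi):
    """Right side of (31)."""
    return (math.log(p0) - math.log(n)) / math.log(phi)


# Illustrative values only.
p0, n, phi = 1e-10, 1e7, 0.01
exact = r_bound_exact(p0, n, phi)
approx = r_bound_approx(p0, n, phi)
p_with_9 = false_declaration_probability(phi, 9, n)
p_with_8 = false_declaration_probability(phi, 8, n)
```

For these values both bounds are essentially 8.5, and evaluating (29) directly confirms that nine detections meet the requirement while eight do not.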

The number n has a simple estimate which can be derived as
follows. Suppose that each time frame contains a total of m
pixels and that from one frame to the next each detection may
be followed by a detection at any of k different pixels. If k
is the same for each successive frame in the set that
determines the admissible track, then

$$n = m\,k^{r-1} . \tag{32}$$

If k varies it can be replaced by an average (geometric) value
estimate, or, if the aim is to be conservative, by an upper
bound.*

A substitution from (32) into (31) leads to

$$r \ge \frac{\log P_0 - \log m - (r-1)\log k}{\log \phi} ,$$

which, in turn, provides the result

$$r \ge \frac{\log P_0 - \log m + \log k}{\log k + \log \phi} , \tag{33}$$

provided that

$$k < \frac{1}{\phi} . \tag{34}$$

Unless (34) is satisfied no positive value of r is possible.
In that case the algorithm cannot meet the false alarm rate goal.

*It is certainly a tracker objective to make k a rapidly
decreasing function of r.
The number k is a measure of the amount of branching
permitted by the tracking algorithm, usually in order to allow for
trajectory turns and an error tolerance. Therefore, the
condition (34) implies that the complexity of the tracking
algorithm will be limited by the CFAR specification that the
preliminary detection algorithms are able to meet.

As an example, consider the case in which there are $10^7$
pixels in the entire scene, the CFAR algorithms dispose of 99
percent of the background pixels, and it is required that the
probability of a false target declaration be less than $10^{-10}$.
Then $m = 10^7$, $\phi = 0.01$, and $P_0 = 10^{-10}$. If no branching is
allowed, so that k = 1, according to (33) the number r of
detections that must be considered in the target declaration
algorithm's trajectory pattern before a target can be declared is
greater than 8.5, i.e., 9 or more. If two branches are allowed
r must be 10 or more, if 3, 11 or more, and if 4, 12 or more.
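
The numbers in this example follow directly from (33) and (34). The sketch below, which is not part of the original paper, reproduces them. The function name `min_detections` is an illustrative assumption.

```python
import math


def min_detections(m, phi, p0, k):
    """Smallest integer r satisfying (33); requires k < 1/phi, i.e. (34)."""
    if k * phi >= 1:
        raise ValueError("condition (34) violated: k must be less than 1/phi")
    bound = (math.log10(p0) - math.log10(m) + math.log10(k)) / (
        math.log10(k) + math.log10(phi))
    return math.ceil(bound)


m, p0 = 1e7, 1e-10

# phi = 0.01: 99 percent of background pixels rejected by the CFAR stage.
for_phi_01 = [min_detections(m, 0.01, p0, k) for k in (1, 2, 3, 4)]

# phi = 0.001: 99.9 percent rejected.
for_phi_001 = [min_detections(m, 0.001, p0, k) for k in (1, 17, 18)]
```

The first list gives 9, 10, 11, and 12 detections for one to four branches. With the lower CFAR probability, 6 detections suffice without branching, and a 9-detection track tolerates up to 17 branches but not 18.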

Figure 7 contains a curve that depicts the lower bound on
r as a function of k. As the figure indicates, k must be less
than 100 because of the limitation imposed by (34).

Figure 7 also contains a second curve: for the case in
which $\phi$ is equal to 0.001 (99.9% of the background pixels are
eliminated by the preliminary detection algorithms) but the
other parameters have the same values as in the first case.

An inspection of this curve reveals that the increase in
effectiveness of the detection algorithms permits a large increase
in the number of allowed branches for a given number of
detections in the track pattern. For example, for $\phi = 0.01$ no track
with fewer than 9 pixels is satisfactory, and even if there are
9, only one branch is permitted; however, for $\phi = 0.001$ a track
with as few as 6 pixels is adequate, and if there are 9 pixels
as many as 17 branches are permitted.
FIGURE 7. Minimum number of detections for a tracking
algorithm versus number of branches


IV.  AN  EVALUATION  OF  TWO  COMMON 
IR  SIGNAL  PROCESSING  TECHNIQUES 


A.  INTRODUCTORY  REMARKS 

This  chapter  will  discuss  two  unrelated  techniques  that  are 
included  in  some  IR  signal  processing  approaches  to  target  de¬ 
tection.  The  first  technique,  which  is  sometimes  called  back¬ 
ground  normalization  (Ref.  12),  is  a  method  of  setting  a  detec¬ 
tion  threshold  that  is  adapted  to  the  spatial  variation  of  the 
background  clutter.  The  second,  which  is  incorporated  in  cer¬ 
tain  two-  and  three-color  spectral  discrimination  algorithms,  is 
a  way  of  reducing  the  number  of  degrees  of  freedom  in  the  data 
by  using  ratios  of  the  spectral  components  rather  than  the  com¬ 
ponents  themselves. 

The  aim  of  the  discussion  will  be  to  compare  the  effective¬ 
ness  of  the  techniques  with  that  of  alternative  approaches. 

The analysis that addresses this question here is actually an
extension of the analysis in Appendices A and B of Ref. 4,
which deal with the same topics in a more general way.

Appendix  A  of  Ref.  4  characterizes  background  normaliza¬ 
tion  in  terms  of  an  idealized  version  of  the  process.  The 
present  chapter  will  consider  the  specific  process  as  it  is 
ordinarily  implemented. 

Appendix B of Ref. 4 derives some general implications of
the use of spectral component ratios in three-color systems.

Here the concern will be with the probabilities of false alarm
and target detection. For simplicity, the discussion will
concentrate on two-color discrimination algorithms, although it is
reasonable to suppose that the conclusions hold, at least
qualitatively, for multi-spectral algorithms in general.

The treatment of both topics is self-contained in this
paper. Nevertheless, there is not much overlap with the material
in Ref. 4, which, therefore, might well furnish certain insights
that the present discussion fails to provide.

B.  BACKGROUND  NORMALIZATION 

Background  normalization  is  a  particular  implementation 
of  a  general  process  called  adaptive  thresholding  (cf.  Ref. 
13-18).  The  fundamental  objective  of  signal  processing,  of 
course,  is  to  set  a  detection  threshold  that  is  high  enough  to 
reject  background  clutter  but  low  enough  to  pass  a  target  sig¬ 
nal.  When  the  threshold  selection  varies  with  the  local  back¬ 
ground  distribution,  i.e.,  is  spatially  adaptive,  under  CFAR 
conditions  the  target  detection  probability  can  be  made  larger 
than  would  be  possible  if  the  threshold  were  fixed  for  a  whole 
scene . 

Background  normalization  is  essentially  a  method  of  esti¬ 
mating  the  clutter  that  would  be  observed  at  a  given  point  P 
in  the  absence  of  a  target.  This  estimate  then  provides  a 
basis  for  setting  a  separate  CFAR  threshold  for  each  point  in 
the  scene. 

The prescribed estimate is just the average of the
radiance, or of some function of the radiance (e.g., its square),
measured at points surrounding P in a symmetrical pattern. For
the scene as a whole the process amounts to a transformation of
the background distribution by means of a sliding window average,
which is a special case of the linear transformations discussed
in Chapter III.

A simple example is the transformation defined by the mask

$$\begin{pmatrix}
1/16 & 1/16 & 1/16 & 1/16 & 1/16 \\
1/16 & 0 & 0 & 0 & 1/16 \\
1/16 & 0 & 0 & 0 & 1/16 \\
1/16 & 0 & 0 & 0 & 1/16 \\
1/16 & 1/16 & 1/16 & 1/16 & 1/16
\end{pmatrix}$$


The  point  P  corresponds  to  the  central  pixel  in  the  window,  and 
its  eight  neighbors  are  reserved  as  a  guard  against  a  possible 
overlapping  signal  from  a  target  source  that  might  have  more 
spatial  extent  than  was  anticipated. 
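
A minimal sketch of this 5 by 5 mask in use follows. It is not taken from the paper. The scene, the target amplitude, and the helper name `border_mean` are hypothetical, and the threshold logic that would follow the estimate is omitted.

```python
def border_mean(image, row, col, half=2):
    """Average the pixels on the border of the (2*half+1)-square window
    centred on (row, col). The centre pixel P and its guard band get
    weight zero, as in the mask with 1/16 on the border."""
    values = []
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            if max(abs(dr), abs(dc)) == half:
                values.append(image[row + dr][col + dc])
    return sum(values) / len(values)


# Hypothetical scene: a background that varies linearly across the array,
# plus one bright pixel standing in for a point target at P.
scene = [[10.0 + 2.0 * c + 0.5 * r for c in range(7)] for r in range(7)]
background_at_p = scene[3][3]
scene[3][3] += 25.0

estimate = border_mean(scene, 3, 3)
residual = scene[3][3] - estimate
```

Because the border pixels are placed symmetrically about P, the average reproduces the linearly varying background at P exactly, and the residual equals the amplitude added for the target.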

It is convenient in discussing the general case to
introduce a Cartesian coordinate system chosen so that the point P
is located at the origin; i.e., P will always have the
coordinates (0,0). It will be assumed that the coordinates (x,y) of
any other pixel in the window are integral multiples of a fixed
quantity $\Delta x$ in the horizontal direction and a fixed quantity $\Delta y$
in the vertical direction.* Then, in an m by n window the pixels
will be located at the points $(x_\nu, y_\mu)$, for which

$$\begin{aligned}
x_\nu &= \nu\,\Delta x , & \frac{1-n}{2} \le \nu \le \frac{n-1}{2} , \\
y_\mu &= \mu\,\Delta y , & \frac{1-m}{2} \le \mu \le \frac{m-1}{2} .
\end{aligned} \tag{35}$$

If the continuous background spatial distribution is given
by a function S(x,y), the measured radiance (or a given function
of the radiance) at each point $(x_\nu, y_\mu)$ will be $S(x_\nu, y_\mu)$ in the
absence of a target. Then background normalization consists of
the assignment

$$\bar{S} = \frac{1}{M} \sum_{\nu,\mu} S(x_\nu, y_\mu) , \tag{36}$$

where the sum is taken over a particular set of M points
$(x_\nu, y_\mu)$ out of the mn points in the window.

The set must satisfy just two conditions: (1) P is not a
member; i.e., if $(x_\nu, y_\mu)$ is in the set then either $x_\nu \ne 0$ or
$y_\mu \ne 0$; (2) the points that are members are located
symmetrically with respect to P; i.e., if $(x_\nu, y_\mu)$ is in the set then
so is $(x_{-\nu}, y_{-\mu})$. It is evident from the second condition that

$$\bar{x} = \frac{1}{M} \sum_{\nu} x_\nu = 0 \quad \text{and} \quad \bar{y} = \frac{1}{M} \sum_{\mu} y_\mu = 0 , \tag{37}$$

where the barred quantities are averages of the indicated
coordinates, calculated with respect to all points in the set.

*Usually $\Delta x$ and $\Delta y$ will be the same, but there is no particular
advantage in assuming this restriction here.

The assignment (36) amounts to an interpolation of the
background distribution to the point P from measured values observed
at the M selected points $(x_\nu, y_\mu)$. It was demonstrated in
Appendix A of Ref. 4 that background normalization is
consistent with a power series approximation that is valid, in
general, up to the linear order. Therefore, it is natural to ask
how it compares with an optimum linear interpolation from the
given data.

An obvious choice for the comparison would be the estimate
obtained from the linear function

$$S(x,y) = ax + by + c \tag{38}$$

that fits the given data with the least square error. That is,
the coefficients a, b, and c in (38) are to be determined from
the condition that

$$e = \sum_{\nu,\mu} \left[a x_\nu + b y_\mu + c - S(x_\nu, y_\mu)\right]^2 \tag{39}$$

be a minimum, where again the sum is taken over the M sample
data points.

The standard method of calculating the coefficients, i.e.,
differentiating e with respect to a, b, and c separately and
setting each derivative equal to zero, leads to the system of
equations

$$\begin{aligned}
\sum_{\nu,\mu} x_\nu \left[a x_\nu + b y_\mu + c - S(x_\nu, y_\mu)\right] &= 0 , \\
\sum_{\nu,\mu} y_\mu \left[a x_\nu + b y_\mu + c - S(x_\nu, y_\mu)\right] &= 0 , \\
\sum_{\nu,\mu} \left[a x_\nu + b y_\mu + c - S(x_\nu, y_\mu)\right] &= 0 .
\end{aligned} \tag{40}$$

Because of (37) the last equation reduces to

$$c = \frac{1}{M} \sum_{\nu,\mu} S(x_\nu, y_\mu) . \tag{41}$$

But according to (38) the linear interpolation for the
background at P is given by

$$S(0, 0) = c . \tag{42}$$

A comparison of (36), (41), and (42) shows that the
least-square-error linear interpolation for the radiance (or a given
function of the radiance) at P is identical with the estimate
given by background normalization.
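
This identity can be confirmed numerically by solving the normal equations (40) for an arbitrary data set on a symmetric point pattern. The sketch below is not from the paper. The sample values, the small linear solver, and the function names are illustrative assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def plane_fit(points, values):
    """Least-square-error coefficients (a, b, c) of S = a*x + b*y + c."""
    basis = [[x, y, 1.0] for x, y in points]
    normal = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
              for i in range(3)]
    rhs = [sum(row[i] * s for row, s in zip(basis, values)) for i in range(3)]
    return solve_linear(normal, rhs)


# The 16 border pixels of a 5 x 5 window, in units of Delta x = Delta y = 1.
points = [(x, y) for x in range(-2, 3) for y in range(-2, 3)
          if max(abs(x), abs(y)) == 2]

# Hypothetical background samples, deliberately not linear in x and y.
values = [3.0 + 2.0 * x - y + 0.5 * x * x + x * y for x, y in points]

a, b, c = plane_fit(points, values)
mean_value = sum(values) / len(values)
```

The fitted intercept c coincides with the plain average of the samples even though the data are not linear, which is the content of (41) and (42).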

Since  there  are  only  three  parameters  to  be  determined  for 
the  linear  fit  indicated  by  (38),  it  can  be  accomplished  as  long 
as  there  are  more  than  three  measured  values  of  S(x,y).  Gener¬ 
ally,  there  will  be  more — e.g.,  eight  in  the  case  of  a  3  x  3 
element  window,  or  at  least  sixteen  in  the  case  of  a  5  x  5  ele¬ 
ment  window. 


This suggests the possibility of improving the background
normalization technique by using an interpolation based on a
square error fit of the data to a quadratic instead of a linear
polynomial. That is, (38) would be replaced by

$$S(x,y) = c + ax + by + A_{11} x^2 + 2 A_{12} xy + A_{22} y^2 , \tag{43}$$

and the coefficients c, a, b, $A_{11}$, $A_{12}$, $A_{22}$ would be determined
so as to minimize the square error

$$e = \sum_{\nu,\mu} \left[c + a x_\nu + b y_\mu + A_{11} x_\nu^2 + 2 A_{12} x_\nu y_\mu + A_{22} y_\mu^2 - S(x_\nu, y_\mu)\right]^2 . \tag{44}$$

The resulting value of c, once again, would be the least square
error estimate S(0, 0) of P.

The  argument  (based  on  the  symmetrical  distribution  of 
data  points  about  P)  used  to  obtain  (37)  also  implies  that 


7-7-  <> 


(45)


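It will be found, as a result, that, on setting the derivatives
of e with respect to each of the six coefficients in (43) equal
to zero, only three of the six least-square-error equations for
the coefficients will contain c. Those equations are, in fact,

$$\begin{aligned}
c + \overline{x^2}\,A_{11} + \overline{y^2}\,A_{22} &= \overline{S} , \\
\overline{x^2}\,c + \overline{x^4}\,A_{11} + \overline{x^2 y^2}\,A_{22} &= \overline{x^2 S} , \\
\overline{y^2}\,c + \overline{x^2 y^2}\,A_{11} + \overline{y^4}\,A_{22} &= \overline{y^2 S} ,
\end{aligned} \tag{46}$$

where all barred quantities are averages over the M sample
data points, e.g.,

$$\overline{x^2 S} = \frac{1}{M} \sum_{\nu,\mu} x_\nu^2\, S(x_\nu, y_\mu) .$$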

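If the coefficient determinant $\Delta$ of (46) is different from
zero, Cramer's rule will provide an explicit solution for c, $A_{11}$,
and $A_{22}$. However, only c is of interest here. It is given by

$$c = F\,\overline{S} - G\,\overline{x^2 S} - H\,\overline{y^2 S} , \tag{47}$$

where

$$F = \frac{\overline{x^4}\,\overline{y^4} - \left(\overline{x^2 y^2}\right)^2}{\Delta} , \qquad
G = \frac{\overline{x^2}\,\overline{y^4} - \overline{y^2}\,\overline{x^2 y^2}}{\Delta} , \qquad
H = \frac{\overline{y^2}\,\overline{x^4} - \overline{x^2}\,\overline{x^2 y^2}}{\Delta} , \tag{48}$$

and $\Delta$ is given by the determinant

$$\Delta = \begin{vmatrix}
1 & \overline{x^2} & \overline{y^2} \\
\overline{x^2} & \overline{x^4} & \overline{x^2 y^2} \\
\overline{y^2} & \overline{x^2 y^2} & \overline{y^4}
\end{vmatrix} . \tag{49}$$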


Since c is the least square error estimate of S(0, 0),
it is evident from the form of (47) that the least-square-error
estimate $S_2$ of quadratic order (replacing the linear-order
estimate $S_1 = \bar{S}$) will be a weighted average; i.e.,

$$S_2 = c = \sum_{\nu,\mu} W_{\nu\mu}\, S(x_\nu, y_\mu) . \tag{50}$$

The weights $W_{\nu\mu}$, obtained from an inspection of (47), are
given by

$$W_{\nu\mu} = \frac{F - G\,x_\nu^2 - H\,y_\mu^2}{M} = \frac{F - G\,\nu^2\,\Delta x^2 - H\,\mu^2\,\Delta y^2}{M} . \tag{51}$$


As an example, consider the case, introduced earlier, of
a 5 by 5 pixel window for which only the 16 border pixels are
sample data points. For this case the ordinary background
normalization, or linear, estimate consists of the average

$$S_1 = \sum_{\nu,\mu} W_{\nu\mu}\, S(x_\nu, y_\mu) , \qquad \nu = \pm 2 \ \text{or} \ \mu = \pm 2 ,$$

with equal weights,

$$W_{\nu\mu} = \frac{1}{16} .$$

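For the quadratic interpolation estimate it is necessary,
first, to calculate the averages $\overline{x^2}$, $\overline{y^2}$, $\overline{x^4}$, $\overline{y^4}$, $\overline{x^2 y^2}$, which
can be done without much difficulty by using the mask introduced
earlier as a guide. The results are

$$\overline{x^2} = \frac{5\,(x_2^2 + x_{-2}^2) + 2\,(x_1^2 + x_{-1}^2)}{16} = \frac{5 \cdot 8 + 2 \cdot 2}{16}\,\Delta x^2 = 2.75\,\Delta x^2 ,$$

$$\overline{y^2} = 2.75\,\Delta y^2 ,$$

$$\overline{x^4} = \frac{5\,(x_2^4 + x_{-2}^4) + 2\,(x_1^4 + x_{-1}^4)}{16} = 10.25\,\Delta x^4 ,$$

$$\overline{y^4} = 10.25\,\Delta y^4 ,$$

$$\overline{x^2 y^2} = \frac{(x_1^2 + x_{-1}^2)(y_2^2 + y_{-2}^2) + (x_2^2 + x_{-2}^2)(y_1^2 + y_{-1}^2) + (x_2^2 + x_{-2}^2)(y_2^2 + y_{-2}^2)}{16} = 6\,\Delta x^2\,\Delta y^2 .$$

The determinant $\Delta$, defined by (49), is therefore given by

$$\Delta = 4.78125\,\Delta x^4\,\Delta y^4 .$$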

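Then the calculations indicated by (48) provide the results

$$F = 14.44444 , \qquad G = \frac{2.44444}{\Delta x^2} , \qquad H = \frac{2.44444}{\Delta y^2} . \tag{52}$$

Finally, the weights can be obtained by substituting from (52)
into (51). The results are

$$\begin{aligned}
W_{0,\pm 2} = W_{\pm 2,0} &= \frac{F - 4G\,\Delta x^2}{16} = 0.29167 , \\
W_{\pm 2,\pm 1} = W_{\pm 1,\pm 2} &= \frac{F - G\,\Delta x^2 - 4H\,\Delta y^2}{16} = 0.13889 , \\
W_{\pm 2,\pm 2} &= \frac{F - 4G\,\Delta x^2 - 4H\,\Delta y^2}{16} = -0.31944 .
\end{aligned}$$

The corresponding mask will be

$$\begin{pmatrix}
-0.31944 & 0.13889 & 0.29167 & 0.13889 & -0.31944 \\
0.13889 & 0 & 0 & 0 & 0.13889 \\
0.29167 & 0 & 0 & 0 & 0.29167 \\
0.13889 & 0 & 0 & 0 & 0.13889 \\
-0.31944 & 0.13889 & 0.29167 & 0.13889 & -0.31944
\end{pmatrix}$$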
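
The averages, the determinant, and the weights of this example can be recomputed exactly with rational arithmetic. The sketch below is not part of the original paper. It takes $\Delta x = \Delta y = 1$ as an assumption, and the names `average` and `estimate` are illustrative.

```python
from fractions import Fraction

# Border pixels of a 5 x 5 window, in units of Delta x = Delta y = 1.
points = [(x, y) for x in range(-2, 3) for y in range(-2, 3)
          if max(abs(x), abs(y)) == 2]
M = len(points)


def average(f):
    """Average of f(x, y) over the M sample points, as an exact fraction."""
    return Fraction(sum(f(x, y) for x, y in points), M)


x2 = average(lambda x, y: x * x)
y2 = average(lambda x, y: y * y)
x4 = average(lambda x, y: x ** 4)
y4 = average(lambda x, y: y ** 4)
x2y2 = average(lambda x, y: x * x * y * y)

# Determinant (49), expanded along its first row, then F, G, H from (48).
delta = ((x4 * y4 - x2y2 ** 2)
         - x2 * (x2 * y4 - y2 * x2y2)
         + y2 * (x2 * x2y2 - y2 * x4))
F = (x4 * y4 - x2y2 ** 2) / delta
G = (x2 * y4 - y2 * x2y2) / delta
H = (y2 * x4 - x2 * x2y2) / delta

# Weights (51).
weights = {(x, y): (F - G * x * x - H * y * y) / M for x, y in points}


def estimate(surface):
    """Quadratic-order estimate S_2 of surface(0, 0) from the border samples."""
    return sum(w * surface(x, y) for (x, y), w in weights.items())
```

The exact weights are 7/24, 5/36, and -23/72, which round to the tabulated 0.29167, 0.13889, and -0.31944, and they sum to one. Applied to any quadratic surface, for example `estimate(lambda x, y: 3 + 2*x - y + x*x - 4*x*y + 5*y*y)`, the weighted average returns the value of the surface at the origin exactly.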

C.  MULTI-COLOR  ALGORITHMS  BASED  ON  SPECTRAL  COMPONENT  RATIOS 

An  N-color  IR  system  collects  data  in  N  spectral  channels 
defined  by  N  distinct,  non-overlapping  wavelength  bands  which 
are  presumably  chosen  because  the  spectral  signatures  that  they 
provide  for  targets  differ  as  much  as  possible  from  those  that 
they provide for backgrounds. As is customary in this paper,
N-component vectors $\mathbf{J}_T$ and $\mathbf{J}_C$ will represent the radiance
distribution over the channels, the first for the case in which a
target is present and the second for the case in which targets
are absent.

Some two- and three-color target detection algorithms that
have been proposed do not operate directly on the components
$J_i$, i = 1,...,N of the vectors $\mathbf{J}_T$ and $\mathbf{J}_C$, but rather on the ratios
of N-1 of the $J_i$, i = 1,...,N-1, to a single component $J_N$. That
is, the ratio variables

$$X_i = \frac{J_i}{J_N} , \qquad i = 1, \ldots, N-1 , \tag{54}$$

replace the variables $J_i$, i = 1,...,N, and it is the $X_i$ that
enter into the target detection algorithms. As a result,
there must be a certain loss of information since the number of
free variables will then be reduced by one. The question is:
how does this use of ratios in a target detection algorithm
affect the false alarm probability?
For simplicity the discussion will be confined to
two-color systems, although similar conclusions may be expected in
the case of systems that employ three or more colors. For two
colors it is possible to construct a simple graphical
representation of the pair of measured spectral components $J_1$ and $J_2$.

This is illustrated by Fig. 8 which depicts a planar
coordinate system for points that are defined when $J_1$ and $J_2$ are
regarded as Cartesian coordinates. The figure represents a
data plane in which every point corresponds to a pair of
measurements in the two spectral bands of interest, and every pair of
such measurements corresponds to a point in the plane.

Assume that there is a distinct bivariate joint probability
distribution for $(J_1, J_2)$ corresponding to the target source
and another such distribution corresponding to the clutter. In
terms of its probability distribution a mean point $(\bar{J}_1, \bar{J}_2)$ will
be defined for the target and another will be defined for clutter.
These are indicated by the labels "target" and "clutter" in the
figure.

To each point in the data plane there is an associated line
through the point and the origin of the coordinate system. The
ratio of the corresponding spectral components will be equal to
the slope of the line or the reciprocal of the slope, depending
upon how the ratio is defined. Figure 8 shows the lines (solid)
associated in this way with the target and clutter means.

A straightforward discrimination criterion is provided by
the following rule.* If for a pair of measurements $J_1$ and $J_2$
the value of the probability density function (PDF)
corresponding to the target is greater than the value of the PDF
corresponding to clutter, the source is presumed to be the target.
Otherwise, the source is presumed to be clutter.

*This rule is introduced here instead of one based on a CFAR
requirement to simplify the calculations that illustrate the
major points of interest. It also makes it possible to
normalize the evaluation to one for which the false alarm
probability is the figure of merit.


FIGURE  8.  2D  versus  ratio  discrimination 
for  2-color  systems 


Then  the  curve  defined  by  the  equation  that  is  formed  when 
the  target  PDF  is  set  equal  to  the  clutter  PDF  divides  the  data 
plane  into  two  regions:  one  consisting  of  points  regarded  as 
due  to  the  target  and  the  other  consisting  of  points  regarded 
as  due  to  clutter.  This  boundary  is  indicated  in  Fig.  8  by 
the  line  labeled  "2D  discrimination  line". 

Although the boundary is shown in the figure as straight,
in general it will be a curve or, in fact, it may even consist
of two distinct branches of a curve. If the PDFs are both
bivariate Gaussian the boundary will be a conic section (Chapter
III, Section A), i.e., a parabola, an ellipse, or a hyperbola.

If the covariance matrix for the target PDF and that for the
clutter PDF are identical, in the case of Gaussian distributions
the boundary will be a straight line. Appendix A contains a
detailed discussion of these and related matters.
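
The distinction between a straight and a curved boundary can be checked with a short sketch, which is not taken from the paper. The decision function is the difference of the two log PDFs, and the boundary is where it vanishes. The means, covariances, and function names below are hypothetical. The helper `midpoint_defect` is zero for every pair of points exactly when the function is linear.

```python
import math


def log_gaussian_pdf(point, mean, cov):
    """Log of the bivariate Gaussian PDF with 2 x 2 covariance matrix `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    # Quadratic form (p - mean)^T cov^{-1} (p - mean) via the 2 x 2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return -0.5 * quad - math.log(2.0 * math.pi * math.sqrt(det))


def decision(point, target, clutter):
    """Positive when the target PDF exceeds the clutter PDF at `point`."""
    return log_gaussian_pdf(point, *target) - log_gaussian_pdf(point, *clutter)


def midpoint_defect(f, p, q):
    """f(p) + f(q) - 2 f(midpoint); vanishes for a linear function f."""
    mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    return f(p) + f(q) - 2.0 * f(mid)


# Hypothetical (mean, covariance) pairs, for illustration only.
shared = ((2.0, 0.5), (0.5, 1.0))
target = ((6.0, 3.0), shared)
clutter_same = ((2.0, 2.0), shared)
clutter_diff = ((2.0, 2.0), ((4.0, 0.0), (0.0, 1.0)))

p, q = (1.0, 5.0), (7.0, -2.0)
defect_same = midpoint_defect(lambda x: decision(x, target, clutter_same), p, q)
defect_diff = midpoint_defect(lambda x: decision(x, target, clutter_diff), p, q)
```

With identical covariance matrices the quadratic terms cancel and the defect is zero to rounding error, so the boundary is a straight line. With different covariance matrices the defect is clearly nonzero, reflecting the quadratic terms that make the boundary a conic section.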

The target and clutter probability distributions will each
induce a corresponding univariate distribution for the ratio of
spectral components.* A discrimination criterion similar to
that based on the bivariate PDFs can be formulated in terms of
the ratio PDFs.

When  the  target  and  clutter  ratio  PDFs  are  set  equal  the 
solution  of  that  equation  provides  a  boundary  between  the  re¬ 
gion  consisting  of  points  designated  as  target  and  the  region 
consisting  of  points  designated  as  clutter  by  the  ratio  dis¬ 
crimination  criterion.  This  boundary  is  a  line  or  lines,  with 
slopes  given  by  the  solution  of  the  equation,  passing  through 
the  origin  of  the  coordinate  system,  and  it  defines  regions 
that  are  angular  sectors.  This  is  illustrated  in  Fig.  8  by  a 
line  labeled  "ratio  discrimination  line". 

It is evident that the 2D discrimination rule and the
ratio rule do not always agree. The region labeled "excess
false alarms" in the figure consists of points that are false
alarms from the point of view of the 2D rule, and the region
labeled "excess missed targets" consists of points that are
missed target detections from the same point of view. From
the ratio rule point of view the same regions would be
applicable with the labels reversed.

* See Appendix D for a derivation of the PDF.

Figure  9  illustrates  a  similar  data  plane  configuration, 
except  that  in  this  case  the  target  and  clutter  means  have  the 
same  ratio,  although  the  mean  points  are  still  separated  by  a 
considerable  margin.  Note  that  the  ratio  boundary  between 
designated  target  and  clutter  points  consists  of  two  lines  in 
this  example. 

The  triangular  region  labeled  "ratio  false  alarms"  con¬ 
sists  of  points  that  are  false  alarms  from  the  point  of  view 
of  the  2D  rule.  The  two  angular  sector  regions  labeled  "2D 
excess  false  alarms"  consist  of  points  that  are  false  alarms 
from  the  point  of  view  of  the  ratio  rule.  The  actual  false 
alarm  probabilities  are  determined  not  by  the  areas  of  these 
regions  but  by  the  result  of  integrating  the  bivariate  clutter 
PDF  over  the  regions. 

Figure 10 illustrates graphically several cases of a bivariate
Gaussian distribution. The solid-line ellipse represents a curve
of constant probability for the case of uncorrelated spectral
components with the standard deviation of one component equal to
ten times that of the other. Also, the ratio corresponding to
the mean point is defined to be $\bar{J}_1/\bar{J}_2$ and is
equal to 2/3.

The  dashed-line  ellipses  in  Fig.  10  are  lines  of  constant 
probability  for  distributions  that  are  obtained  from  the  orig¬ 
inal  distribution  by  rotating  the  coordinate  system  about  the 
mean  point  through  various  angles  as  indicated.  This  provides 
cases  in  which  the  covariance  matrix  is  not  diagonal,  i.e., 
in  which  the  spectral  components  are  correlated. 
FIGURE 10. Bivariate probability densities with various amounts
of correlation induced by rotation about the mean

[Figure not reproduced. Legend: mean coordinates (50, 75),
corresponding ratio = 2/3. Ellipses are curves of constant
probability at the level corresponding to twice the standard
deviation in the uncorrelated positions (i.e., 0° or 90°).
Ratio of the larger standard deviation to the smaller is 10:1.]


Figure 11 shows curves that represent the ratio PDF*
corresponding to each bivariate PDF illustrated in Fig. 10.
Note that each curve has a single mode which occurs near, but
not exactly at, the ratio of the mean components, i.e., 2/3.
Also note, by comparison with Fig. 10, that the largest mode
occurs for the case in which the major axis of the corresponding
bivariate constant probability ellipse is collinear with the
line joining the mean point and the origin of the coordinate
system. Further, the smallest mode occurs for the case in
which it is the minor axis that is collinear with that line.
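The position of the mode can be checked numerically. The sketch below is illustrative only: it assumes uncorrelated (hence independent) Gaussian components with the Fig. 10 mean point (50, 75), and it uses hypothetical standard deviations of 5 for both components, which are not the values used for Fig. 11. The ratio density is evaluated from the standard relation f_R(r) = ∫ |y| f_1(ry) f_2(y) dy rather than from the closed form of Appendix D.

```python
import math


def gauss(x, mean, sigma):
    """Univariate Gaussian density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def ratio_pdf(r, mean1, sigma1, mean2, sigma2, n=400):
    """Density of R = J1/J2 for independent Gaussian J1 and J2.

    Trapezoidal evaluation of the integral of |y| f1(r*y) f2(y) dy
    over mean2 +/- 10*sigma2.
    """
    lo = mean2 - 10.0 * sigma2
    h = 20.0 * sigma2 / n
    total = 0.0
    for i in range(n + 1):
        y = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * abs(y) * gauss(r * y, mean1, sigma1) * gauss(y, mean2, sigma2)
    return total * h


step = 0.005
rs = [i * step for i in range(401)]  # ratio grid from 0 to 2
pdf = [ratio_pdf(r, 50.0, 5.0, 75.0, 5.0) for r in rs]
mode = rs[max(range(len(rs)), key=pdf.__getitem__)]
area = sum(0.5 * (pdf[i] + pdf[i + 1]) * step for i in range(len(rs) - 1))
print("mode of ratio PDF:", mode, " ratio of means:", 50.0 / 75.0)
print("area under PDF on [0, 2]:", area)
```

With these assumed parameters almost all of the probability lies in the interval [0, 2], so the area is a simple sanity check on the integration.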

To  calculate  the  probability  of  false  alarm  (PFA)  for 
either  the  ratio  or  the  2D  rule,  as  observed  in  Chapter  III,  it 
is  only  necessary  to  integrate  the  clutter  PDF  over  the  appro¬ 
priate  region  for  a  bivariate  Gaussian  distribution.  The 
region  will  always  be  bounded  by  straight  lines  whenever  the 
target  and  background  covariance  matrices  are  the  same.  Accord¬ 
ing  to  the  mathematical  model  proposed  in  Chapter  II,  this  will 
generally  be  the  case  for  spectral  discrimination  of  point  tar¬ 
gets.  Appendix  D  shows  in  detail  how  such  integrals  can  be 
evaluated  efficiently. 

To make the false alarm probability calculation particularly
easy, consider the simplest possible case, in which the target
and clutter probability distributions are both bivariate Gaussian
with uncorrelated spectral components having identical
standard deviations. In accordance with the mathematical model
of Chapter II the standard deviation will be the same for the
target and clutter distributions, as well. For this case
Table 1 provides false alarm probabilities due to the ratio
rule and to the 2D (bivariate) rule for three different sets
of means given in units of the common standard deviation.


Equation  D-l^  of  Appendix  D  was  used  to  plot  these  curves. 
FIGURE 11. Probability densities for the ratio $J_1/J_2$
when $J_1$ and $J_2$ are uncorrelated at 0° and correlated
by rotating distribution counterclockwise about the mean
point $(\bar{J}_1, \bar{J}_2)$
TABLE 1. FALSE ALARM PROBABILITIES FOR RATIO
AND 2D DISCRIMINATION RULES

ARTIFICIAL DATA

SPECTRAL      CLUTTER   TARGET   STANDARD    PFA           RULE
BAND          MEAN      MEAN     DEVIATION

[illegible]   5         10       1           0.5034        Ratio
[illegible]   5         10       1           [illegible]   2D
                                             0.5032        Ratio (excess)

[illegible]   5         10       1           0.4948        Ratio
[illegible]   6         11       1           [illegible]   2D
                                             0.4946        Ratio (excess)

[illegible]   5         5        1           0.0339        Ratio
[illegible]   5         13       1           [illegible]   2D
                                             0.0338        Ratio (excess)

It is seen in the table that for the case in which the target
and clutter means have the same ratio of spectral components,
as illustrated in Fig. 9, the ratio rule produces a false alarm
probability that is more than 50 percent, while the 2D rule's
false alarm probability is about 0.02 percent. When the means
are shifted slightly so that the target and clutter means are
no longer associated with identical component ratios the ratio
rule false alarm probability improves slightly to a little less
than 50 percent while the 2D rule false alarm probability
remains essentially the same. When the means are shifted by a
greater amount so that the ratio associated with the target
mean is somewhat greater than 2-1/2 times the ratio associated
with the clutter mean the false alarm probability due to the
ratio test improves considerably. However, it is still more
than 3 percent, while the false alarm probability due to the
2D rule is less than 0.01 percent.
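The 2D-rule figures quoted above can be reproduced with a short calculation. This is a sketch under the stated assumptions of this case (equal-PDF decision rule, uncorrelated components, common unit standard deviation), for which the decision boundary is the perpendicular bisector of the segment joining the two means and the false alarm probability is Q(d/2), with d the distance between the means. The mean values are those read from Table 1; the function names are hypothetical and this is not necessarily the procedure of Appendix D.

```python
import math


def q(x):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pfa_2d_equal_sigma(clutter_mean, target_mean, sigma=1.0):
    """False alarm probability of the equal-PDF (2D) rule when target and
    clutter share the covariance sigma^2 * I."""
    d = math.dist(clutter_mean, target_mean)
    return q(d / (2.0 * sigma))


cases = [
    ((5.0, 5.0), (10.0, 10.0)),  # identical component ratios
    ((5.0, 6.0), (10.0, 11.0)),  # ratios slightly different
    ((5.0, 5.0), (5.0, 13.0)),   # ratios differ by a factor of 2.6
]
for clutter, target in cases:
    print(clutter, target, pfa_2d_equal_sigma(clutter, target))
```

The first two cases give about 2 x 10^-4 (0.02 percent) and the third about 3 x 10^-5, in line with the statements in the text.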

Tables  2,  3  and  4  contain  the  results  of  similar  calculations 
based  on  data  taken  from  Ref.  9  for  natural  terrain  backgrounds. 
Data for the targets were made up by using equivalent temperature
means that are 3σ or 5σ above the corresponding background means
for one or both of the wavelength bands. One scene is a conifer
forest  in  Michigan  and  the  other  is  a  mountainous  area  in 
Nevada. 

An  examination  of  the  tables  indicates  that  the  ratio  rule 
produces  consistently  higher  false  alarm  probabilities  than 
does  the  2D  rule.  In  many  of  the  examples  the  PFAs  differ  by 
an  order  of  magnitude  or  more. 
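For correlated components the same calculation goes through with the Euclidean distance replaced by the Mahalanobis distance D between the means, giving Q(D/2) for the 2D rule. The sketch below assumes the equal-PDF rule and identical target and clutter covariance matrices; the statistics are the conifer forest values listed in Table 4, and the function names are hypothetical. Under these assumptions it reproduces the legible 2D entries of that table to within rounding.

```python
import math


def q(x):
    """Upper tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pfa_2d(clutter_mean, target_mean, sigmas, rho):
    """2D-rule false alarm probability for bivariate Gaussians sharing one
    covariance matrix (standard deviations `sigmas`, correlation `rho`)."""
    z1 = (target_mean[0] - clutter_mean[0]) / sigmas[0]
    z2 = (target_mean[1] - clutter_mean[1]) / sigmas[1]
    # Squared Mahalanobis distance between the two means.
    d2 = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return q(0.5 * math.sqrt(d2))


clutter = (281.77, 277.58)   # deg K, 3.5-3.9 and 4.5-5.5 micron bands
sigmas = (3.6689, 0.6341)
rho = 0.169

both_3_sigma = pfa_2d(clutter, (292.777, 279.482), sigmas, rho)
second_band_only = pfa_2d(clutter, (281.77, 279.482), sigmas, rho)
print(round(both_3_sigma, 4), round(second_band_only, 4))
```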


TABLE  2.  FALSE  ALARM  PROBABILITIES  FOR  RATIO 
AND  2D  DISCRIMINATION  RULES 


Background Type: Mountainous Terrain (Nellis AF Base, Nevada)

Conditions: AM (1100, 2-26-78), high overcast, light haze, visibility 15 mi
Aircraft: Altitude 1000 ft, ground speed 200 ft/sec, flight direction East

Area Covered: 1750 ft x 6750 ft; Depression Angle: 35 deg; IFOV: 2.5 mrad
Radiance Units: deg K


TABLE  3.  FALSE  ALARM  PROBABILITIES  FOR  RATIO 
AND  2D  DISCRIMINATION  RULES 


Background Type: Mountainous Terrain (Nellis AF Base, Nevada)

Conditions: AM (0930, 2-25-78), high thin scattered clouds, visibility 15 mi
Aircraft: Altitude 1750 ft, ground speed 200 ft/sec, flight direction West
Area Covered: 1750 ft x 6750 ft; Depression Angle: 90 deg; IFOV: 2.5 mrad
Radiance Units: deg K


TABLE 4. FALSE ALARM PROBABILITIES FOR RATIO
AND 2D DISCRIMINATION RULES


Background Type: Conifer Forest (Michigan)

Conditions: 1230 (Winter, 4-3-79, 4-4-79), no clouds, snow-covered ground, air
temperature -2 deg C
Aircraft: Altitude 1750 ft, ground speed 202 ft/sec, flight direction NNW
Area Covered: 1650 ft x 1750 ft; Depression Angle: 90 deg; IFOV: 2.5 mrad
Radiance Units: deg K

SPECTRAL    CLUTTER       TARGET        STANDARD    CORRELATION   PFA           RULE
BAND        MEAN          MEAN          DEVIATION   COEFFICIENT

3.5-3.9 μ   281.77        300.115       3.6689      0.169         0.0190        Ratio
4.5-5.5 μ   277.58        280.751       0.6341                    [illegible]   2D
                                                                  0.0187        Ratio (excess)
                                                                  [illegible]   2D (excess)

3.5-3.9 μ   281.77        300.115       3.6689      0.169         [illegible]   Ratio
4.5-5.5 μ   277.58        277.58        0.6341                    [illegible]   2D
                                                                  [illegible]   Ratio (excess)
                                                                  [illegible]   2D (excess)

3.5-3.9 μ   [illegible]   [illegible]   3.6689      0.169         0.3322        Ratio
4.5-5.5 μ   277.58        280.751       0.6341                    [illegible]   2D
                                                                  0.3293        Ratio (excess)
                                                                  0.0027        2D (excess)

3.5-3.9 μ   281.77        292.777       3.6689      0.169         0.1064        Ratio
4.5-5.5 μ   277.58        279.482       0.6341                    0.0250        2D
                                                                  0.0910        Ratio (excess)
                                                                  0.0095        2D (excess)

3.5-3.9 μ   281.77        292.777       3.6689      0.169         0.0643        Ratio
4.5-5.5 μ   277.58        277.58        0.6341                    0.0641        2D
                                                                  [illegible]   Ratio (excess)
                                                                  4 x 10^-4     2D (excess)

3.5-3.9 μ   281.77        281.77        3.6689      0.169         0.4058        Ratio
4.5-5.5 μ   277.58        279.482       0.6341                    0.0643        2D
                                                                  0.3710        Ratio (excess)
                                                                  0.0295        2D (excess)


V.  SUMMARY  AND  CONCLUSIONS 


A.  SUMMARY  OF  TOPICS  COVERED 

Based  on  the  assumption  that  IR  measurement  data  separated 
into  N  distinct  channels  have  an  N-variate  Gaussian  probability 
distribution,  this  paper  formulates  a  mathematical  model  for  the 
background  radiance  in  the  presence  and  in  the  absence  of  targets. 
The  formulation  includes  both  spectral  and  spatial  discriminants 
for  the  case  of  point  targets. 

According  to  the  model  as  proposed,  if  the  N  data  channels 
are  defined  as  spectral  bands  it  is  usually  the  case  that  the 
probability  distributions  associated  with  the  presence  of  a 
target  and  with  the  absence  of  any  target  differ  significantly 
only  in  their  N  dimensional  mean  vectors.  That  is,  their  N 
by  N  covariance  matrices  are  assumed  to  be  nearly  identical. 

This  will  be  true  as  long  as  the  target  occults  only  a  small 
fraction  of  the  sensor's  footprint. 

On  the  other  hand,  if  the  N  data  channels  are  defined  in 
terms  of  the  spatial  discriminant,  i.e.,  so  that  each  channel 
represents  the  radiance  level  at  a  single  pixel  in  an  N  pixel 
window,  the  covariance  matrices  associated  with  the  presence 
or  absence  of  a  target  will  differ  unless  the  target  occupies 
just  a  small  fraction  of  a  pixel.  In  fact,  the  model  assumes 
an  explicit  form  (Chapter  II,  Section  E)  for  the  difference  of 
the  two  matrices  when  the  target  exactly  fills  a  single  pixel. 

For certain calculations it is convenient to change coordinates
by means of a principal axis transformation relative to
the covariance matrix associated with an N-variate Gaussian
probability distribution. Appendix C shows in detail how this
can be done simultaneously for the two covariance matrices
associated with spatial data channels in the presence and in the
absence of a target.

The purpose of the model is to provide a means for obtaining
rough evaluations of proposed target discrimination schemes
on the basis of what may be regarded as a minimal acceptance
standard. The analysis (Chapter III) covers optimum CFAR
discrimination and also includes a consideration of the
effectiveness of tracking algorithms (Chapter III, Section C)
after CFAR discrimination algorithms have been applied.

In  addition  to  these  topics  and  some  related  detail  on 
linear  filtering  (Chapter  III,  Section  B)  and  how  to  calculate 
various  quantities  of  interest,  this  paper  also  deals  (Chapter 
IV)  with  two  special  subjects.  One  is  a  method  of  adaptive 
thresholding  known  as  background  normalization.  The  other  is 
the  question  of  whether  it  is  useful  or  harmful  for  multi¬ 
color  systems  to  use  ratios  of  spectral  components,  rather  than 
the  components  themselves,  in  target  discrimination  processing. 

The ordinary background normalization process amounts to
a linear least-square-error interpolation of local measurement
data to predict the value of the background radiance, or some
function of the radiance, in a given direction in the absence
of a target. This paper shows how to extend the interpolation
to include terms of quadratic order by means of a special linear
filter. As an example, the weights that define the filter mask
for the case of a 5 by 5 pixel window are calculated.

The analysis required for the multi-color question involves
a calculation of the probability distribution for spectral
component ratios. Properties of the corresponding probability
density for the two-color case are discussed in detail.

Chapter IV contains tables of false alarm probability
calculations for the two-color case for two discrimination
algorithms: one based on the one-dimensional distribution for
the ratio of the two spectral components and the other on the
two-dimensional bivariate Gaussian distribution for the
components themselves. The tables answer the question concerning
the relative merit of the two approaches. Most of the data used
for the calculations are taken from Ref. 9, which provides
results of radiance measurements over several wavelength bands
for a variety of terrain backgrounds.

B.  CONCLUSIONS 

(1) Experimental data (Ref. 9) for a variety of terrain
backgrounds, especially those unaffected by human intervention,
exhibit radiance distributions that are well approximated (out
to 2σ, 3σ or more) by Gaussian probability density functions.
This may be adequate for realistically estimating the effect of
preliminary detection algorithms for which the false alarm rate
requirements are relatively modest. However, for some scenes
that have been affected by human intervention, notably A.P. Hill,
Virginia and Baltimore, Maryland, the approximation is poor.
At any rate, the assumption of a Gaussian distributed background
provides a minimal standard against which to measure an
algorithm's clutter rejection performance.

(2) The data in Ref. 9, provided by the Environmental
Research Institute of Michigan (ERIM), is presented in a form
that is well-suited to mathematical modeling of the spectral
distribution of terrain background radiance. It is also useful
for testing hypotheses concerning the spatial distribution of
the radiance. Data for other types of background, e.g., clouds,
would be similarly useful if gathered and presented in the same
form.

(3) The ERIM data supports the assumption that a natural
(e.g., a conifer forest) scene is spatially isotropic, but not
the assumption that the cross-correlation function is
exponential. In fact, the Wiener spectra given in Ref. 9 for a
conifer forest are not consistent with any simple power law
generalization of the spectrum associated with an exponential
cross-correlation function. The isotropic character of the
spatial distribution obtained for a conifer forest in the ERIM
data contradicts a model that is sometimes assumed (cf. Ref. 10)
for the cross-correlation.

(4) If the distribution of IR background radiance over
N channels (spectral, spatial or temporal) is N-variate Gaussian
when targets are present and when they are not, the optimum CFAR
target discrimination criterion is an inequality involving a
quadratic function of the measured data unless the covariance
matrix for the background in the absence of any target is
identical with that for the background when a target is present.

(5) When the two covariance matrices are identical the
optimum discrimination algorithm is equivalent to applying a
linear digital filter and then thresholding. This optimum
linear filter is the same as the well-known filter that
maximizes signal-to-noise if the signal power is identified with
the square of the difference between the mean target and mean
background signals and the noise power is identified with the
variance of the background distribution.

(6) For spectral channels the two covariance matrices
approach equality when the target occults a small fraction of
the sensor's footprint, as is usually the case. For spatial
channels the two covariance matrices approach equality when the
target occupies just a small fraction of a pixel. The optimum
filter will be approximately linear if this happens, particularly
if the mean target signal differs from the mean background
signal by several standard deviations. When the target size is
of the order of a pixel the spatial covariance matrices will
differ, and the optimum spatial filter will not be linear.

(7) If a tracking algorithm is used for the final decision
whether a target is or is not present after the application of
one or more preliminary CFAR detection algorithms has eliminated
most of the candidate detections, the effectiveness of the
tracking algorithm will depend upon the effectiveness of the
preliminary algorithms. In fact, unless the false alarm
probability after the preliminary detection phase is below a
certain critical value, no tracking algorithm can satisfy a given
false alarm rate requirement. Moreover, the sensitivity of a
tracking algorithm to error or to unpredicted target accelerations
will increase rapidly with an increase in the false alarm
probability for the preliminary detection phase.

(8)  When  ratios  of  spectral  components  are  used  by  multi¬ 
color  systems  to  discriminate  between  targets  and  background 
rather  than  the  components,  themselves,  the  discrimination 
algorithm  will  be  less  effective.  In  particular,  for  the  two- 
color  case  applied  to  typical  natural  background  data  obtained 
by  ERIM  (Ref.  9),  when  an  algorithm  based  on  the  ratio  of  the 
two  spectral  components  is  used  instead  of  one  based  on  the  two- 
dimensional  distribution  of  the  components  the  calculated  false 
alarm  probability  is  consistently  larger,  often  by  an  order  of 
magnitude  or  more. 
APPENDIX A


OPTIMUM  CFAR  DISCRIMINATION 

A.  THE  GENERAL  CASE 

It will be assumed that there are N data channels, each
providing a radiance measurement proportional to a signal $J_i$,
i = 1, ..., N. The $J_i$ will be regarded as the components of a
vector $\mathbf{J}$ and as coordinates of a point in an N
dimensional space.

It will also be assumed that an admissible discrimination
process will determine a region $R_T$ in the data space such that
all measurement sets representing points in $R_T$ will be regarded
as due to a target source and all other measurements as due to
clutter. Further, it will be assumed that there is a function
$\psi(\mathbf{J})$ and a quantity $\tau$ such that the region $R_T$
consists of points $\mathbf{J}$ that satisfy the inequality

$$\psi(\mathbf{J}) < \tau . \tag{A-1}$$

Suppose that there is a joint probability distribution for
the components of $\mathbf{J}$, conditioned on the presence of a
target, and an associated probability density $P_T(\mathbf{J})$.
Suppose also that the complementary joint probability distribution
conditioned on the absence of a target has the density
$P_C(\mathbf{J})$. Then the false alarm probability will be given by

$$\mathrm{PFA} = \int_{R_T} P_C(\mathbf{J})\, d\mathbf{J} , \tag{A-2}$$

where the notation is understood to indicate a volume integral
over the N dimensional region $R_T$. Similarly, the probability
that a target will be detected if it is present is given by

$$\mathrm{PTD} = \int_{R_T} P_T(\mathbf{J})\, d\mathbf{J} . \tag{A-3}$$

For a constant false alarm rate (CFAR) it is necessary to
choose the region $R_T$ so that PFA, given by (A-2), is equal to
some prescribed constant $\phi$. Then the optimum discrimination
between targets and clutter will occur when PTD given by (A-3) is
maximized subject to the condition that PFA is equal to $\phi$.

That is, the problem is to choose the function $\psi(\mathbf{J})$
so as to maximize PTD, the choice being restricted to those
functions for which PFA is equal to $\phi$. This leads to the
variational equation

$$\delta\left[\mathrm{PTD} + \lambda\,(\phi - \mathrm{PFA})\right] = 0 , \tag{A-4}$$

subject to the condition

$$\mathrm{PFA} = \phi , \tag{A-5}$$

where $\lambda$ is the usual Lagrange multiplier and the variation
is taken with respect to $\psi(\mathbf{J})$.

The variation calculated by means of the standard procedure
in the calculus of variations after substituting from (A-1),
(A-2) and (A-3) leads to the equation

$$\int_{B_T} \left[P_T(\mathbf{J}) - \lambda P_C(\mathbf{J})\right] \delta\psi(\mathbf{J})\, d\mathbf{J} = 0 , \tag{A-6}$$

where the integral is taken over the hypersurface $B_T$ determined
by

$$\psi(\mathbf{J}) = \tau . \tag{A-7}$$

The equation (A-6) must hold for all admissible functions
$\delta\psi(\mathbf{J})$; hence, in accordance with the standard
argument,

$$P_T(\mathbf{J}) = \lambda P_C(\mathbf{J}) \tag{A-8}$$

for all points satisfying (A-7).

The equation (A-8) may be regarded as equivalent to (A-7);
hence the relations

$$\psi(\mathbf{J}) = \log P_T(\mathbf{J}) - \log P_C(\mathbf{J}) , \qquad \tau = \log\lambda \tag{A-9}$$

provide a satisfactory solution of the variational problem.
The constant $\lambda$, and therefore $\tau$, can then be
determined by solving the equation (A-5) after substituting from
(A-2) and (A-9).

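B. THE N-VARIATE GAUSSIAN DISTRIBUTION

An important special case, which often has at least an
approximate validity, is the case in which $P_T(\mathbf{J})$ and
$P_C(\mathbf{J})$ are both N-variate Gaussian probability
densities, having the form

$$P(\mathbf{J}) = \frac{1}{(2\pi)^{N/2}\,|M|^{1/2}} \exp\left[-\tfrac{1}{2}\,(\mathbf{J}-\bar{\mathbf{J}})^t M^{-1} (\mathbf{J}-\bar{\mathbf{J}})\right] , \tag{A-10}$$

where $\bar{\mathbf{J}}$ is the mean vector defined by

$$\bar{\mathbf{J}} = \int_{\infty} \mathbf{J}\, P(\mathbf{J})\, d\mathbf{J} , \tag{A-11}$$

$M$ is the covariance matrix whose elements $M_{ij}$ are defined by

$$M_{ij} = \int_{\infty} (J_i - \bar{J}_i)(J_j - \bar{J}_j)\, P(\mathbf{J})\, d\mathbf{J} , \tag{A-12}$$

$|M|$ is the determinant of $M$, and the superscript t in (A-10)
indicates the transpose (row) vector. The symbol $\infty$ in (A-11)
and (A-12) means that the integration region for the integrals
so labeled is all N space. The densities $P_T(\mathbf{J})$ and
$P_C(\mathbf{J})$ are determined completely by their respective
means $\bar{\mathbf{J}}_T$ and $\bar{\mathbf{J}}_C$ and their
covariance matrices $M_T$ and $M_C$ in accordance with the form
(A-10).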

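According to (A-10), (A-9) and (A-1) the optimum CFAR
algorithm for declaring a target detection when an N channel
measurement set consists of the components of $\mathbf{J}$ is the
rule: for a prescribed false alarm probability $\phi$ a target is
present if

$$(\mathbf{J}-\bar{\mathbf{J}}_T)^t M_T^{-1} (\mathbf{J}-\bar{\mathbf{J}}_T) - (\mathbf{J}-\bar{\mathbf{J}}_C)^t M_C^{-1} (\mathbf{J}-\bar{\mathbf{J}}_C) < \gamma , \tag{A-13}$$

where $\gamma$ is a constant determined by the condition

$$\int_{R(\gamma)} \frac{1}{(2\pi)^{N/2}\,|M_C|^{1/2}} \exp\left[-\tfrac{1}{2}\,(\mathbf{J}-\bar{\mathbf{J}}_C)^t M_C^{-1} (\mathbf{J}-\bar{\mathbf{J}}_C)\right] d\mathbf{J} = \phi . \tag{A-14}$$

The integration region $R(\gamma)$ in (A-14) consists of points
$\mathbf{J}$ that satisfy (A-13).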


The left side of (A-13) is the difference between two quadratic
forms in the quantities $\mathbf{J}-\bar{\mathbf{J}}_T$ and
$\mathbf{J}-\bar{\mathbf{J}}_C$. It can be simplified
somewhat by a small amount of algebraic manipulation which
will reduce it to the sum of a quadratic form in $\mathbf{J}$, a
linear form in $\mathbf{J}$ and a constant. In fact, (A-13) can be
written

$$\mathbf{J}^t\, \Delta M\, \mathbf{J} - 2\,\mathbf{L}^t \mathbf{J} < \kappa , \tag{A-15}$$

where $\Delta M$ is a matrix given by

$$\Delta M = M_T^{-1} - M_C^{-1} , \tag{A-16}$$

$\mathbf{L}$ is a vector given by

$$\mathbf{L}^t = \bar{\mathbf{J}}_T^t M_T^{-1} - \bar{\mathbf{J}}_C^t M_C^{-1} , \tag{A-17}$$

and $\kappa$ is a constant given by

$$\kappa = \gamma - \bar{\mathbf{J}}_T^t M_T^{-1} \bar{\mathbf{J}}_T + \bar{\mathbf{J}}_C^t M_C^{-1} \bar{\mathbf{J}}_C .^{*} \tag{A-18}$$

In deriving (A-13) the fact that the covariance matrices, and
therefore their inverses, are symmetric is used.

* If the region $R_T$ is prescribed by the form (A-13) then it is
the constant $\kappa$ that must be determined by (A-14).

If $\Delta M$ is not zero the region in N dimensional space defined
by (A-15) is bounded by the N dimensional version of a quadric
surface. If N is 2 the boundary is a conic section, i.e., an
ellipse, a parabola or a hyperbola.

If the target and clutter covariance matrices are identical,
however, $\Delta M$ is zero. In that case the region defined by
(A-15) is a half space bounded by a hyperplane, i.e., the N
dimensional version of a plane.
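The reduction of (A-13) to a quadratic form, a linear form and a constant can be checked numerically. The sketch below is an illustration for the two-channel case only; the means and covariance matrices are arbitrary made-up values, and the helper names are hypothetical.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def matvec(m, v):
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def quad(v, m):
    """Quadratic form v^t m v."""
    return dot(v, matvec(m, v))


def difference_of_forms(j, jt, mt, jc, mc):
    """Difference of the two Mahalanobis quadratic forms (target minus clutter)."""
    dt = (j[0] - jt[0], j[1] - jt[1])
    dc = (j[0] - jc[0], j[1] - jc[1])
    return quad(dt, inv2(mt)) - quad(dc, inv2(mc))


def coefficients(jt, mt, jc, mc):
    """Return (delta_m, l_vec, const) so that the difference of forms equals
    j^t delta_m j - 2 l_vec.j + const."""
    it, ic = inv2(mt), inv2(mc)
    delta_m = tuple(tuple(it[r][c] - ic[r][c] for c in range(2)) for r in range(2))
    lt, lc = matvec(it, jt), matvec(ic, jc)
    l_vec = (lt[0] - lc[0], lt[1] - lc[1])
    const = dot(jt, lt) - dot(jc, lc)
    return delta_m, l_vec, const


# Made-up example values.
jt, mt = (3.0, 4.0), ((2.0, 0.3), (0.3, 1.0))
jc, mc = (1.0, 2.0), ((1.5, -0.2), (-0.2, 0.8))
j = (1.5, -0.7)

delta_m, l_vec, const = coefficients(jt, mt, jc, mc)
expanded = quad(j, delta_m) - 2.0 * dot(l_vec, j) + const
print(difference_of_forms(j, jt, mt, jc, mc), expanded)
```

When the two covariance matrices are equal, `delta_m` vanishes and only the linear term remains, which is the half-space case described above.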

C. THE UNIVARIATE GAUSSIAN DISTRIBUTION

As an example of how the optimum discrimination algorithm
can be formulated in practice it may be useful to consider a
special case in detail. The univariate Gaussian distribution is
obviously the simplest special case. It is also a useful one
to consider because it plays a fundamental role in the
construction of optimum linear filters.

When the probability distribution is univariate the mean
vector and covariance matrix are actually scalar quantities.
Thus, the means associated with the target and clutter
distributions are constants $\bar{J}_T$ and $\bar{J}_C$, and
variances $\sigma_T^2$ and $\sigma_C^2$, both of which are also
constant, replace the covariance matrices $M_T$ and $M_C$.

Then (A-15) becomes

$$\left(\frac{1}{\sigma_T^2} - \frac{1}{\sigma_C^2}\right) J^2 - 2LJ < \kappa , \tag{A-19}$$

where, because of (A-17),

$$L = \frac{\bar{J}_T}{\sigma_T^2} - \frac{\bar{J}_C}{\sigma_C^2} . \tag{A-20}$$

After a substitution from (A-20) the relation (A-19) may be
replaced (for $\sigma_T < \sigma_C$, so that the coefficient of
$J^2$ in (A-19) is positive) by

$$J^2 - 2\beta J - \gamma < 0 , \tag{A-21}$$

where

$$\beta = \frac{\sigma_C^2\,\bar{J}_T - \sigma_T^2\,\bar{J}_C}{\sigma_C^2 - \sigma_T^2} \tag{A-22}$$

and $\gamma$ is a new constant, replacing $\kappa$, to be
determined by (A-14). It follows from (A-21) that

$$\beta - \sqrt{\beta^2 + \gamma} < J < \beta + \sqrt{\beta^2 + \gamma} . \tag{A-23}$$

The interval defined by (A-23) is the one dimensional version of
the region that was labeled $R_T$ in the general case and
$R(\gamma)$ in the N-variate Gaussian case discussed in Sections A
and B of this appendix. That is, (A-23) gives the criterion for
declaring that the measurement J is due to a target.

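However, before the criterion (A-23) can be used it is still
necessary to determine the constant $\gamma$. This can be done by
using the CFAR condition (A-14), which takes the form

$$\phi = \frac{1}{\sigma_C\sqrt{2\pi}} \int_{\beta-\sqrt{\beta^2+\gamma}}^{\beta+\sqrt{\beta^2+\gamma}} \exp\left[-\frac{(J-\bar{J}_C)^2}{2\sigma_C^2}\right] dJ$$

or, equivalently because of (A-22),

$$\phi = \frac{1}{\sqrt{2\pi}} \int_{\alpha-\mu}^{\alpha+\mu} e^{-x^2/2}\, dx , \tag{A-24}$$

where

$$\alpha = \frac{\beta - \bar{J}_C}{\sigma_C} = \frac{\sigma_C\,(\bar{J}_T - \bar{J}_C)}{\sigma_C^2 - \sigma_T^2} \tag{A-25}$$

and

$$\mu = \frac{\sqrt{\beta^2+\gamma}}{\sigma_C} . \tag{A-26}$$

Actually, the simplest procedure now is to determine $\mu$
from (A-24) and (A-25) in terms of the CFAR value $\phi$. Then,
instead of using (A-23), the interval that contains the target
declarations can be expressed in terms of $\mu$ and $\beta$, which
is defined in terms of the given probability distribution
parameters by (A-22); i.e., a target is declared if

$$\beta - \mu\sigma_C < J < \beta + \mu\sigma_C . \tag{A-27}$$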
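From the more general definitions (A-3) and (A-10) it
follows that

$$\mathrm{PTD} = \frac{1}{\sigma_T\sqrt{2\pi}} \int_{\beta-\mu\sigma_C}^{\beta+\mu\sigma_C} \exp\left[-\frac{(J-\bar{J}_T)^2}{2\sigma_T^2}\right] dJ .$$

With the aid of (A-22) this can be written somewhat more
conveniently as

$$\mathrm{PTD} = \frac{1}{\sqrt{2\pi}} \int_{\hat{\alpha}-\hat{\mu}}^{\hat{\alpha}+\hat{\mu}} e^{-x^2/2}\, dx , \tag{A-28}$$

where

$$\hat{\mu} = \frac{\sigma_C}{\sigma_T}\,\mu \tag{A-29}$$

and

$$\hat{\alpha} = \frac{\beta - \bar{J}_T}{\sigma_T} = \frac{\sigma_T\,(\bar{J}_T - \bar{J}_C)}{\sigma_C^2 - \sigma_T^2} . \tag{A-30}$$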

Given a required PFA value $\phi$, with parameters $\alpha$ defined
by (A-25), $\beta$ defined by (A-22), and $\mu$ determined by the
equation (A-24), the optimum criterion for a target declaration in
the sense of maximizing the PTD is given by (A-27). The
corresponding PTD is given by (A-28) in terms of quantities
$\hat{\mu}$ defined by (A-29) and $\hat{\alpha}$ defined by (A-30).
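The procedure just summarized can be sketched numerically. The following is a minimal illustration, assuming a target variance smaller than the clutter variance so that the declaration region is a finite interval centered at β; the half-width of the interval is found by bisection on the CFAR condition instead of by table lookup, and the function name and example values are hypothetical.

```python
from statistics import NormalDist


def cfar_interval(jt, sigma_t, jc, sigma_c, pfa):
    """Return (beta, mu, achieved_pfa, ptd) for the univariate rule
    beta - mu*sigma_c < J < beta + mu*sigma_c, assuming sigma_t < sigma_c."""
    if not sigma_t < sigma_c:
        raise ValueError("this sketch assumes sigma_t < sigma_c")
    beta = (sigma_c**2 * jt - sigma_t**2 * jc) / (sigma_c**2 - sigma_t**2)
    clutter = NormalDist(jc, sigma_c)
    target = NormalDist(jt, sigma_t)

    def mass(dist, half_width):
        # Probability that J falls inside the interval around beta.
        return dist.cdf(beta + half_width) - dist.cdf(beta - half_width)

    lo, hi = 0.0, sigma_c
    while mass(clutter, hi) < pfa:      # bracket the solution
        hi *= 2.0
    for _ in range(200):                # bisection on the CFAR condition
        mid = 0.5 * (lo + hi)
        if mass(clutter, mid) < pfa:
            lo = mid
        else:
            hi = mid
    half_width = 0.5 * (lo + hi)
    return beta, half_width / sigma_c, mass(clutter, half_width), mass(target, half_width)


beta, mu, achieved_pfa, ptd = cfar_interval(jt=3.0, sigma_t=1.0, jc=0.0, sigma_c=2.0, pfa=0.01)
print(beta, mu, achieved_pfa, ptd)
```

For these made-up values the interval is centered at β = 4, beyond the target mean, which reflects the fact that the narrower target distribution dominates the likelihood ratio there.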

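If $\sigma_T = \sigma_C$ then (A-19) becomes the trivial condition
that the interval of possible values of J be divided into two
complementary sub-intervals. That is, a constant $\tau$ divides the
interval

$$-\infty < J < \infty$$

into two intervals

$$-\infty < J < \tau , \qquad \tau < J < \infty ,$$

one of which, it will be assumed by the target discrimination
rule, contains all values of J, and only those values, that may
be attributed to a target source.

If $\Delta J > 0$ (i.e., $\bar{J}_T > \bar{J}_C$) the value of
$\tau$ is to be determined by the CFAR condition

$$\phi = \frac{1}{\sigma\sqrt{2\pi}} \int_{\tau}^{\infty} \exp\left[-\frac{(J-\bar{J}_C)^2}{2\sigma^2}\right] dJ = \frac{1}{\sqrt{2\pi}} \int_{(\tau-\bar{J}_C)/\sigma}^{\infty} e^{-x^2/2}\, dx ,$$

where $\sigma$ is the common value of $\sigma_C$ and $\sigma_T$.
That is, for $\nu$ such that

$$\phi = \frac{1}{\sqrt{2\pi}} \int_{\nu}^{\infty} e^{-x^2/2}\, dx , \tag{A-31}$$

$\tau$ will be determined by

$$\tau = \sigma\nu + \bar{J}_C . \tag{A-32}$$

Then a target is declared whenever

$$J > \tau .$$

If $\Delta J < 0$ then $\nu$ is still defined by (A-31) but (A-32)
must be replaced by

$$\tau = -\sigma\nu + \bar{J}_C . \tag{A-33}$$

Then a target is declared if

$$J < \tau .$$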
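For equal variances the threshold follows directly from the inverse of the standard normal distribution function. The sketch below assumes that reading of (A-31) to (A-33); the function name and the example values are hypothetical.

```python
from statistics import NormalDist


def equal_variance_cfar(jt, jc, sigma, pfa):
    """Threshold rule for equal target and clutter variances.

    Returns (tau, side, ptd): a target is declared when J > tau if side is
    ">" (target mean above clutter mean) and when J < tau if side is "<".
    """
    nu = NormalDist().inv_cdf(1.0 - pfa)   # upper tail beyond nu equals pfa
    target = NormalDist(jt, sigma)
    if jt > jc:
        tau = jc + sigma * nu
        return tau, ">", 1.0 - target.cdf(tau)
    tau = jc - sigma * nu
    return tau, "<", target.cdf(tau)


tau, side, ptd = equal_variance_cfar(jt=5.0, jc=0.0, sigma=1.0, pfa=1e-3)
print(tau, side, ptd)
```

With a target mean five standard deviations above the clutter mean and a false alarm probability of 10^-3, the threshold is a little over three standard deviations above the clutter mean.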

In either case the PTD can be calculated by integrating

$$P_T(J) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(J-\bar{J}_T)^2}{2\sigma^2}\right]$$

over  the  interval  in  which  ta 
is  given  by 
APPENDIX B


POWER SPECTRAL DENSITY CALCULATIONS

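Chapter II discusses two correlation functions that are
sometimes suggested as models for the spatial distribution of
background clutter. One is

$$K_{x,y}(\mathbf{r},\mathbf{r}') = \exp\left(-\frac{|x-x'|}{L_x} - \frac{|y-y'|}{L_y}\right) , \tag{B-1}$$

which is anisotropic even when $L_x = L_y$. The other is

$$K(\mathbf{r},\mathbf{r}') = \exp\left(-\frac{|\mathbf{r}-\mathbf{r}'|}{L}\right) , \tag{B-2}$$

which is isotropic.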

The power spectral density (or Wiener spectrum) for
$K_{x,y}(\mathbf{r},\mathbf{r}')$ is given by the Fourier transform

$$W_{x,y}(\mathbf{k}) = \iint \exp\left[\, i\,\mathbf{k}\cdot\mathbf{u} - \frac{|x-x'|}{L_x} - \frac{|y-y'|}{L_y} \right] du\, dv , \tag{B-3}$$

where $\mathbf{k}$ is the wave number vector with components
$(k_x, k_y)$ and $\mathbf{u}$ is the displacement vector with
components

$$u = x - x' , \qquad v = y - y' .$$

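Since

$$\mathbf{k}\cdot\mathbf{u} = k_x u + k_y v ,$$

the double integral in (B-3) is a product of two single
integrals that are easy to evaluate individually. The result of
the evaluation is

$$W_{x,y}(\mathbf{k}) = L_x L_y \left(\frac{1}{1-ik_xL_x} + \frac{1}{1+ik_xL_x}\right) \left(\frac{1}{1-ik_yL_y} + \frac{1}{1+ik_yL_y}\right) = \frac{4 L_x L_y}{(1+k_x^2L_x^2)(1+k_y^2L_y^2)} . \tag{B-4}$$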

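The power spectral density for $K(\mathbf r,\mathbf r')$ is given by the Fourier transform

$$ W(\mathbf k) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\left[i\,\mathbf k\cdot\mathbf u-\frac{|\mathbf u|}{L}\right]du\,dv , \tag{B-5} $$

which, after changing variables to polar coordinates $(\rho, \theta)$, becomes

$$ W(\mathbf k) = \int_0^{\infty} e^{-\rho/L}\int_0^{2\pi} e^{ik\rho\cos(\theta-\phi)}\,d\theta\,\rho\,d\rho , \tag{B-6} $$

where $(k, \phi)$ are the polar coordinates of the vector $\mathbf k$. The inner, angular, integral is independent of $\phi$ because the integration interval is exactly one period of the integrand which, of course, is periodic in $\theta$; therefore, $\phi$ can be set equal to zero. The integral over $\theta$ can then be recognized as a well known representation for the Bessel function of order zero. Thus, (B-6) may be written

$$ W(k) = 2\pi\int_0^{\infty} e^{-\rho/L}\,J_0(k\rho)\,\rho\,d\rho . $$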

With the aid of a standard table of integrals (e.g., Ref. B-1, p. 712) this can be recognized as equivalent to

$$ W(k) = \frac{2\pi L^2}{(1+k^2L^2)^{3/2}} . \tag{B-7} $$


As  Ref.  B-2  points  out,  if  the  background  distribution 
appears  to  be  isotropic  when  viewed  at  a  90  deg  depression 
angle  (i.e.,  vertically),  when  it  is  viewed  at  an  angle  that 
is  less  than  90  deg  it  should  appear  to  be  anisotropic.  This 
is  because  there  will  be  an  apparent  change  of  scale  in  the 
cross-track  but  not  in  the  in-track  direction.  In  such  a  case 
the  natural  generalization  of  the  correlation  function  K(r,r') 
given  by  (B-2)  is  a  function  of  the  form 


$$ K_y(\mathbf r,\mathbf r') = \exp\left[-\frac{1}{L}\sqrt{(x-x')^2+\alpha^2(y-y')^2}\right] . \tag{B-8} $$



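The corresponding power spectral density will be given by

$$ W_y(\mathbf k) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\left[i\,\mathbf k\cdot\mathbf u-\frac{1}{L}\sqrt{u^2+\alpha^2v^2}\right]du\,dv , $$

which, after a change of variable, becomes

$$ W_y(\mathbf k) = \frac{1}{\alpha}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\left[i\,\bar{\mathbf k}\cdot\bar{\mathbf u}-\frac{|\bar{\mathbf u}|}{L}\right]d\bar u\,d\bar v , \tag{B-9} $$

where $\bar{\mathbf k}$ is the vector with components $(k_x, k_y/\alpha)$ and $\bar{\mathbf u}$ is the vector with components $(\bar u, \bar v) = (u, \alpha v)$. Since (B-9) has exactly the form of (B-5) its value can be obtained by inspection of (B-7). Thus,

$$ W_y(\mathbf k) = \frac{2\pi\alpha^2L^2}{\left(\alpha^2+\alpha^2k_x^2L^2+k_y^2L^2\right)^{3/2}} . \tag{B-10} $$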
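The closed forms above lend themselves to a quick numerical sanity check. The sketch below codes the spectra as read from (B-4), (B-7) and (B-10), and estimates one of the single integrals behind (B-4) with a plain midpoint rule. All function names and the truncation and step-count parameters are illustrative assumptions, not part of the paper.

```python
import math


def w_separable(kx, ky, lx, ly):
    """Wiener spectrum (B-4) of the separable exponential correlation (B-1)."""
    return 4.0 * lx * ly / ((1.0 + (kx * lx) ** 2) * (1.0 + (ky * ly) ** 2))


def w_isotropic(k, l):
    """Wiener spectrum (B-7) of the isotropic exponential correlation (B-2)."""
    return 2.0 * math.pi * l ** 2 / (1.0 + (k * l) ** 2) ** 1.5


def w_scaled(kx, ky, l, alpha):
    """Wiener spectrum (B-10) for the rescaled correlation function (B-8)."""
    denom = alpha ** 2 + (alpha * kx * l) ** 2 + (ky * l) ** 2
    return 2.0 * math.pi * alpha ** 2 * l ** 2 / denom ** 1.5


def single_factor(k, l, u_max=60.0, steps=200_000):
    """Midpoint estimate of the integral of exp(i*k*u - |u|/l) over all u."""
    # The imaginary part cancels by symmetry, leaving 2*cos(k*u)*exp(-u/l) on u >= 0.
    # The range is truncated at u_max correlation lengths, where the integrand is negligible.
    h = u_max * l / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += 2.0 * math.cos(k * u) * math.exp(-u / l) * h
    return total
```

With `alpha = 1` the rescaled spectrum should reduce to the isotropic one, and the product of two `single_factor` values should reproduce `w_separable`, which is a convenient way to catch transcription errors in the formulas.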


Isotropic background distributions with Wiener spectra for which the functional forms differ from that in (B-7) apparently occur more frequently than not. A natural generalization of the Wiener spectrum given by (B-7) that produces an infinite class of possible correlation functions can be obtained by replacing the exponent 3/2 in the denominator on the right side with any positive number $\nu$. The corresponding correlation function would then be given by


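$$ K(\mathbf r,\mathbf r';\nu) = \frac{L^2}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{\exp[-i\,\mathbf k\cdot(\mathbf r-\mathbf r')]}{(1+k^2L^2)^{\nu}}\,dk_x\,dk_y = L^2\int_0^{\infty}\frac{J_0(k|\mathbf r-\mathbf r'|)}{(1+k^2L^2)^{\nu}}\,k\,dk . \tag{B-11} $$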


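The integral in (B-11) can be evaluated with the aid of a formula on p. 488 of Ref. B-3. The result is

$$ K(\mathbf r,\mathbf r';\nu) = \frac{1}{2^{\nu-1}\,\Gamma(\nu)}\left(\frac{|\mathbf r-\mathbf r'|}{L}\right)^{\nu-1}K_{\nu-1}\!\left(\frac{|\mathbf r-\mathbf r'|}{L}\right) , \tag{B-12} $$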


where the function on the right having the form $K_{\nu-1}(x)$ is a modified Bessel function. The modified Bessel function $K_n(x)$ can be expressed in terms of elementary functions when the order n is an odd multiple of 1/2 (cf. Ref. B-3, p. 444); e.g.,

$$ K_{1/2}(x) = K_{-1/2}(x) = \sqrt{\frac{\pi}{2x}}\;e^{-x} , $$

$$ K_{3/2}(x) = K_{-3/2}(x) = \sqrt{\frac{\pi}{2x}}\left(1+\frac{1}{x}\right)e^{-x} . $$
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CANONICAL  VARIABLES  FOR  SPATIAL  CHANNELS 


The mathematical model for spatial channels that was introduced in Chapter II assumes a target covariance matrix $M_T$ given by

$$ M_T = M_C - \Delta M , \tag{C-1} $$

where $M_C$ has the elements $M_{ij}$ and $\Delta M$ has the elements $\Delta M_{ij}$ given by

$$ \Delta M_{ij} = M_{jo}\,\delta_{io} + M_{io}\,\delta_{jo} - M_{oo}\,\delta_{io}\delta_{jo} - \sigma_T^2\,\delta_{io}\delta_{jo} . \tag{C-2} $$


In (C-2) and throughout this Appendix the subscripts are understood to be two component vectors, and quantities $\delta_{ik}$ are Kronecker deltas which are equal to one if the subscript vectors i and k are identical in both of their components but are otherwise zero. A zero subscript represents the zero vector, both of whose components are zero. Any sum that is indicated over a subscript will mean that a double sum is to be taken independently over both components of the subscript vector.

In the case of an n by n pixel window there will be $n^2$ channels, one for every pixel. Each of the subscript vector components independently takes on n values, so that the vector, itself, takes on $n^2$ values, one for every channel.

As observed in Chapter III, for optimum CFAR discrimination in general it is sometimes useful to transform the variables associated with the natural channels to new variables in terms of which the target and clutter covariance matrices are both diagonal. This can be done by solving the eigenvalue problem*

$$ \left(M_C - \lambda_m M_T\right) Y_m = 0 . \tag{C-3} $$

The purpose of this Appendix is to derive algorithms for calculating the $n^2$ eigenvalues $\lambda_m$ and eigenvectors $Y_m$ for the spatial channel model defined by (C-1) and (C-2) in the case of an n by n pixel window.

First of all, it is evident from (C-1) and (C-3) that any set of linearly independent vectors $Y_m$ that satisfy

$$ \Delta M\,Y_m = 0 \tag{C-4} $$

will be a set of distinct eigenvectors associated with the common eigenvalue $\lambda_m = 1$.

In fact, by using (C-2) explicitly in (C-4) one finds that any vector with components $Y_k$ that satisfy the equations

$$ Y_o = 0 , \qquad \sum_k M_{ko}\,Y_k = 0 \tag{C-5} $$

will be such an eigenvector. For an n by n pixel window the sum in (C-5) contains at most $n^2-1$ non-zero terms since the term corresponding to $k = o$ vanishes.
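A small numerical check makes the role of the conditions (C-5) concrete. The sketch below builds $\Delta M$ from (C-2) as read here, namely $\Delta M_{ij} = M_{jo}\delta_{io} + M_{io}\delta_{jo} - M_{oo}\delta_{io}\delta_{jo} - \sigma_T^2\delta_{io}\delta_{jo}$, for a hypothetical three-channel clutter covariance, and confirms that a vector satisfying (C-5) is annihilated by it. The matrix entries, the target variance of 9 and the use of a single integer index in place of a two-component subscript vector are all assumptions made for brevity.

```python
def delta_m(m_c, sigma_t2, o=0):
    """Build Delta M of (C-2) for clutter covariance m_c and target variance sigma_t2."""
    n = len(m_c)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            di, dj = (i == o), (j == o)  # Kronecker deltas delta_io and delta_jo
            d[i][j] = (m_c[j][o] * di + m_c[i][o] * dj
                       - m_c[o][o] * di * dj - sigma_t2 * di * dj)
    return d


def matvec(a, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


# Hypothetical 3-channel clutter covariance; channel 0 plays the role of subscript o.
M_C = [[4.0, 1.0, 2.0],
       [1.0, 3.0, 0.5],
       [2.0, 0.5, 5.0]]
DM = delta_m(M_C, sigma_t2=9.0)

# A vector with Y_o = 0 and sum_k M_ko * Y_k = 0, i.e. 1*Y_1 + 2*Y_2 = 0.
Y = [0.0, 2.0, -1.0]
residual = matvec(DM, Y)  # every component should vanish, as (C-4) requires
```

Subtracting `DM` from `M_C` also shows what the model does to the target covariance: the row and column of channel o are cleared and the diagonal entry is replaced by the target variance.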

*Cf .  Ref.  (C-l) ,  pp .  37-^1 . 




If it is assumed that the clutter probability distribution is non-degenerate the covariance matrix $M_C$ is non-singular. Then the $n^2$ column vectors of $M_C$ are linearly independent. From the column vectors $M_i$, $i \neq o$, whose components are $M_{ki}$, if there are not already $n^2-2$ values of i for which $M_{oi}$ is zero, it is possible to form $n^2-2$ new linearly independent vectors $V_i$ by defining the components of $V_i$ by

$$ V_{ki} = M_{ki} - \frac{M_{oi}}{M_{oj}}\,M_{kj} , \tag{C-6} $$

where j is any fixed subscript vector such that $M_{oj}$ is not zero. The vectors $V_i$ defined by (C-6) all have the component $V_{oi}$ equal to zero. Together with the vector whose components are $M_{ko}$ for $k \neq o$ and zero in place of $M_{oo}$, the vectors $V_i$ form a set of $n^2-1$ linearly independent vectors, all having zero for the component labeled with the subscript o.

Applying the Gram-Schmidt orthogonalization process* to the $n^2-1$ vectors leads to a set of $n^2-2$ orthogonal vectors $Y_m$, each of whose components provide a different solution of the equations (C-5). These vectors $Y_m$ are therefore $n^2-2$ linearly independent eigenvectors of (C-3), all corresponding to the same eigenvalue, one.

The Gram-Schmidt process is equivalent to a recursive algorithm that is computationally efficient and easy to implement. Given a set of N linearly independent vectors $X_m$ and a defined inner product (U, V) for any pair of vectors U and V, the following recursion relation generates a set of N vectors $Y_m$ that are mutually orthogonal with respect to the inner product and such that $Y_0$ is equal to the vector identified as $X_0$ in the original set:

$$ Y_0 = X_0 , \qquad Y_m = X_m - \sum_{\nu=0}^{m-1}\left(X_m, Z_\nu\right) Z_\nu , \quad m = 1, \ldots, N-1 , \tag{C-7} $$

$$ Z_m = \frac{Y_m}{(Y_m, Y_m)^{1/2}} , \quad m = 0, \ldots, N-1 . $$

*Cf. Ref. (C-2), p. 230.


To calculate the eigenvectors of (C-3) it is only necessary to identify $X_0$ with the vector whose components are defined to be $M_{ko}$, to identify the number N with $n^2-1$, and to define the inner product as the usual scalar product of two vectors; i.e., in terms of column vectors C and K the inner product will be defined by

$$ (C, K) = C^t K = \sum_{\nu=0}^{N-1} C_\nu K_\nu , \tag{C-8} $$

where the superscript t means "transpose".
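The recursion can be written so that the inner product is a parameter, which is what allows the same routine to be reused later with the covariance-weighted inner product (C-9). The following is a minimal sketch in plain Python; the function names are illustrative, and the projection is written in the unnormalized form (dividing by the squared length of each earlier vector), which produces the same orthogonal vectors as a recursion that normalizes each vector first.

```python
def gram_schmidt(xs, inner):
    """Orthogonalize the vectors xs under inner(u, v); the first output equals xs[0]."""
    ys = []
    for x in xs:
        y = list(x)
        for prev in ys:
            # Subtract the projection of x onto each earlier orthogonal vector.
            coeff = inner(x, prev) / inner(prev, prev)
            y = [yi - coeff * pi for yi, pi in zip(y, prev)]
        ys.append(y)
    return ys


def dot(u, v):
    """Usual scalar product of two vectors, as in (C-8)."""
    return sum(a * b for a, b in zip(u, v))


def weighted(m):
    """Inner product u^t M v for a symmetric positive definite matrix M, as in (C-9)."""
    def inner(u, v):
        return sum(u[i] * m[i][j] * v[j]
                   for i in range(len(u)) for j in range(len(v)))
    return inner
```

For example, `gram_schmidt(xs, dot)` orthogonalizes with respect to (C-8), while `gram_schmidt(xs, weighted(M))` does so relative to a covariance matrix `M`.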

The eigenvalue equation (C-3) has two additional eigenvalues for $\lambda_m$ that are not equal to one, with corresponding eigenvectors U and V. Together with the $n^2-2$ vectors $Y_m$, $m \neq 0$, U and V complete a set of $n^2$ linearly independent eigenvectors that satisfy (C-3) when the corresponding eigenvalues are used for $\lambda_m$.

To deal with the problem of finding the two new eigenvalues and eigenvectors it is convenient to define a new inner product by

$$ (C, K) = C^t M_C K , \tag{C-9} $$



which is a bilinear form relative to the covariance matrix $M_C$. Since $M_C$ must be positive definite the inner product defined by (C-9) has all of the properties that are necessary for the inner product operation. In particular, it follows from the standard argument* that, relative to the new inner product, U and V will each be orthogonal to all of the eigenvectors $Y_m$ and to each other, i.e.,

$$ (U, V) = (U, Y_m) = (V, Y_m) = 0 , \quad m \neq 0 . \tag{C-10} $$


The orthogonality of two different eigenvectors in the sense of (C-10) in terms of the inner product defined by (C-9) depends upon the corresponding eigenvalues being different. Therefore, it does not necessarily hold among the first $n^2-2$ eigenvectors $Y_m$.

The orthogonality property (C-10) can be used to find U and V and the corresponding eigenvalues. First, it is necessary to define two vectors P and Q which are orthogonal to the $Y_m$ and to each other. Then P, Q and the $Y_m$ form a set of $n^2$ linearly independent vectors which span the $n^2$ dimensional vector space.

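It will be found that suitable candidates for P and Q are the vectors whose components are given by

$$ P_i = \delta_{io} , \qquad Q_i = M_{io} - S\,\delta_{io} - \sum_m\Big(\sum_{k,l} M_{ko}\,M_{kl}\,Y_{lm}\Big) Y_{im} , \tag{C-11} $$

where

$$ S = \frac{1}{M_{oo}}\sum_k M_{ko}^2 $$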


Cf.  Ref.  (C-l),  pp. 


and the $Y_{im}$ are the components of the vectors $Y_m$. To verify that P and Q whose components are defined by (C-11) satisfy the orthogonality conditions (C-10) when substituted for U and V it is only necessary to substitute from (C-11) into (C-10) and make use of (C-5) and the symmetry of the covariance matrix $M_C$.

It follows from the orthogonality property (C-10) of the eigenvectors that U and V must belong to the subspace spanned by P and Q. That is, if P is not already an eigenvector, as is usually the case, then either eigenvector can be written in the form

$$ U = \alpha P + Q , \tag{C-12} $$

where $\alpha$ is a scalar constant to be determined.

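Because of (C-1) the eigenvalue equation (C-3) can be written

$$ (1-\lambda)\,M_C Y + \lambda\,\Delta M\,Y = 0 . \tag{C-13} $$

If U given by (C-12) is substituted for Y in (C-13) and the resulting equation is multiplied on the left by $P^t$ or by $Q^t$ the first or the second of the two scalar equations

$$ (1-\lambda)(\alpha\Pi_{11}+\Pi_{12}) + \lambda(\alpha\Gamma_{11}+\Gamma_{12}) = 0 , $$
$$ (1-\lambda)(\alpha\Pi_{12}+\Pi_{22}) + \lambda(\alpha\Gamma_{12}+\Gamma_{22}) = 0 \tag{C-14} $$

results, where

$$ \Pi_{11} = P^t M_C P , \quad \Pi_{12} = P^t M_C Q , \quad \Pi_{22} = Q^t M_C Q , $$
$$ \Gamma_{11} = P^t \Delta M\,P , \quad \Gamma_{12} = P^t \Delta M\,Q , \quad \Gamma_{22} = Q^t \Delta M\,Q . \tag{C-15} $$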
By eliminating $\alpha$ from (C-14), using

$$ \alpha = \frac{(\lambda-1)(\Pi_{12}^2-\Pi_{11}\Pi_{22}) + \lambda(\Pi_{11}\Gamma_{22}-\Pi_{12}\Gamma_{12})}{\lambda(\Pi_{12}\Gamma_{11}-\Pi_{11}\Gamma_{12})} \tag{C-16} $$

obtained from the second equation to substitute into the first, a quadratic equation in $\lambda$,

$$ a\lambda^2 + b\lambda + c = 0 , \tag{C-17} $$

results. After some tedious but straightforward algebra it will be found that the coefficients of (C-17) are given by

$$ a = \Pi_{11}\left(\Pi_{11}\Pi_{22} - \Pi_{12}^2 + \Gamma_{11}\Gamma_{22} - \Gamma_{12}^2 + 2\Pi_{12}\Gamma_{12} - \Pi_{11}\Gamma_{22} - \Gamma_{11}\Pi_{22}\right) , $$
$$ b = \Pi_{11}\left(\Pi_{11}\Gamma_{22} + \Gamma_{11}\Pi_{22} - 2\Pi_{12}\Gamma_{12} + 2\Pi_{12}^2 - 2\Pi_{11}\Pi_{22}\right) , \tag{C-18} $$
$$ c = \Pi_{11}\left(\Pi_{11}\Pi_{22} - \Pi_{12}^2\right) . $$

In the definition of a, b, c by (C-18) the common factor $\Pi_{11}$ can be omitted since it has no effect on the solution of (C-17).
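The quadratic (C-17) is the condition that the 2 by 2 determinant of $(1-\lambda)\Pi + \lambda\Gamma$ vanish, which gives a direct way to compute the two remaining eigenvalues once the six scalars of (C-15) are known. The sketch below is illustrative only: the function names are hypothetical, the common factor $\Pi_{11}$ is already dropped, and $\alpha$ is obtained by solving the second equation of (C-14) directly rather than from the combined form (C-16). It assumes a non-negative discriminant and non-zero denominators.

```python
import math


def eigenvalue_quadratic(p11, p12, p22, g11, g12, g22):
    """Coefficients (a, b, c) of (C-17), with the common factor Pi_11 omitted."""
    det_p = p11 * p22 - p12 ** 2
    det_g = g11 * g22 - g12 ** 2
    cross = p11 * g22 + g11 * p22 - 2.0 * p12 * g12
    return det_p + det_g - cross, cross - 2.0 * det_p, det_p


def solve_eigenvalues(p11, p12, p22, g11, g12, g22):
    """Return the two roots of (C-17) in increasing order."""
    a, b, c = eigenvalue_quadratic(p11, p12, p22, g11, g12, g22)
    root = math.sqrt(b * b - 4.0 * a * c)  # discriminant assumed non-negative
    return sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)))


def alpha_for(lam, p11, p12, p22, g11, g12, g22):
    """Mixing coefficient alpha of (C-12), solved from the second equation of (C-14)."""
    return -((1.0 - lam) * p22 + lam * g22) / ((1.0 - lam) * p12 + lam * g12)
```

Given a root `lam`, the eigenvector follows as `alpha_for(lam, ...) * P + Q`, and the first equation of (C-14) then serves as a consistency check.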


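With the aid of (C-2) and (C-11) the $\Pi_{ij}$ and $\Gamma_{ij}$ can be calculated explicitly from (C-15). The results are

$$ \Pi_{11} = M_{oo} , \qquad \Pi_{12} = 0 , \qquad \Pi_{22} = \sum_{i,j} M_{io}\,M_{ij}\,M_{jo} - S^2 M_{oo} , $$
$$ \Gamma_{11} = M_{oo} - \sigma_T^2 , \qquad \Gamma_{12} = \sigma_T^2\,(S-M_{oo}) , \qquad \Gamma_{22} = -(M_{oo}+\sigma_T^2)\,(S-M_{oo})^2 , \tag{C-19} $$

where S is given by the last equation in (C-11).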
It is of some interest to obtain, explicitly, the discriminant d of the quadratic equation (C-17) after removal of the common factor $\Pi_{11}$. The result, calculated from (C-18), is

$$ d = b^2 - 4ac = (\Pi_{11}\Gamma_{22} + \Gamma_{11}\Pi_{22})^2 + 4\,\Pi_{11}\Pi_{22}\,(\Gamma_{12}^2 - \Gamma_{11}\Gamma_{22}) . \tag{C-20} $$


From the definitions (C-15) and the fact that $M_C$ is a positive definite matrix it follows that $\Pi_{11}$ and $\Pi_{22}$ are both positive. Thus, if $\Gamma_{12}$ on the right side of (C-20) is replaced by zero the effect will be to decrease the right side of the equation. That is,

$$ d \geq (\Pi_{11}\Gamma_{22} + \Gamma_{11}\Pi_{22})^2 - 4\,\Pi_{11}\Pi_{22}\,\Gamma_{11}\Gamma_{22} = (\Pi_{11}\Gamma_{22} - \Gamma_{11}\Pi_{22})^2 \geq 0 . \tag{C-21} $$

In other words, according to (C-21) the discriminant is always non-negative. Therefore, the quadratic equation (C-17) for the eigenvalues $\lambda$ has only real roots, which is certainly a requirement. In fact, a fortiori, since $M_C$ and $M_T$ are both positive definite the equation (C-3) can only be satisfied for positive real values of $\lambda$.

A reference to the form of (C-20) and of (C-21) indicates that the discriminant vanishes; i.e., the roots of (C-17) will be equal, only if $\Gamma_{12}$ is zero and $\Pi_{11}\Gamma_{22} = \Gamma_{11}\Pi_{22}$, or $\Pi_{22}$ and $\Gamma_{22}$ are both zero. A reference to (C-19) confirms that the first pair of conditions either imply the second pair, which are satisfied only when

$$ S = M_{oo} \tag{C-22} $$

and

$$ \sum_{i,j} M_{io}\,M_{ij}\,M_{jo} = M_{oo}^3 , $$

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r 


1 


f 

*  r 


or  else  they  imply  that 


M. 


oo 


( C— 2  3 ) 


and 


M 


oo 


T 


Thus,  the  eigenvalues  will  be  equal  if  and  only  if  (C-22)  or 
(C-23)  is  satisfied. 

If the eigenvalues are equal the corresponding independent eigenvectors are P and Q defined by (C-11). If the eigenvalues are not equal their corresponding eigenvectors are given by (C-12) with the respective values of $\alpha$ given by (C-16).

The $n^2-2$ linearly independent vectors $Y_m$ that were obtained from the $V_i$ defined by (C-6) are eigenvectors satisfying the equation (C-3), all corresponding to the eigenvalue $\lambda_m = 1$. They are also mutually orthogonal with respect to the usual inner product defined by (C-8).

The set of all eigenvector solutions of (C-3) consists of the $Y_m$ and the two additional vectors U and V that are linear combinations of P and Q, whose components are defined by (C-11). However, in order to use these eigenvectors to construct a principal axis transformation that simultaneously diagonalizes $M_C$ and $M_T$, which was the original purpose of the analysis in this appendix, a further step is necessary. The eigenvectors must be mutually orthogonal with respect to the inner product defined by (C-9).

This condition is satisfied by U, V and any one of the $Y_m$, because they correspond to different eigenvalues, but it is not necessarily satisfied by the $Y_m$. However, $n^2-2$ linear combinations of the $Y_m$ can be found that are mutually orthogonal with respect to the inner product defined by (C-9). The Gram-Schmidt orthogonalization process, e.g., in the form of the algorithm given by the recursion relations (C-7), will accomplish this objective when it is applied to the $Y_m$ using (C-9) instead of (C-8).

The vectors U, V and the resulting linear combinations of the $Y_m$ are then the column vectors of a matrix T which provides the desired principal axis transformation. That is,

$$ T^t M_T\,T \quad \text{and} \quad T^t M_C\,T $$

will both be diagonal matrices, as required.
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RATIOS  OF  MULTI-VARIATE  GAUSSIAN  DISTRIBUTED  VARIABLES 

1 .  Probability  Density  Functions 

Reference D-1 contains a derivation of the joint probability density of the ratios

$$ \frac{J_1}{J_3} , \qquad \frac{J_2}{J_3} $$

when $J_1, J_2, J_3$ have a tri-variate Gaussian distribution. In this appendix the derivation will be generalized to cover the case of variables $J_1, \ldots, J_N$ with an N-variate Gaussian distribution.

That is, it will be assumed that there is a joint probability density function given by

$$ P(\mathbf J) = \frac{1}{(2\pi)^{N/2}\,|\mathbf M|^{1/2}}\exp\left[-\tfrac12(\mathbf J-\bar{\mathbf J})^t\,\mathbf M^{-1}(\mathbf J-\bar{\mathbf J})\right] , \tag{D-1} $$

where $\mathbf J$ is an N-dimensional vector, $\bar{\mathbf J}$ is the N-dimensional mean vector, $\mathbf M$ is the N by N covariance matrix relative to the probability distribution for $\mathbf J$ and $|\mathbf M|$ is the determinant of $\mathbf M$. The problem is to determine the joint probability density function $P_R(\mathbf X)$ for the ratios

$$ X_i = \frac{J_i}{J_N} , \quad i = 1, \ldots, N-1 , $$

which are components of an (N-1)-dimensional vector $\mathbf X$.


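The argument used in Ref. D-1 for the case of three variables can be extended to cover the general case of N variables. The first step is to define the change of variables

$$ J_i = U X_i , \quad i = 1, \ldots, N-1 , \qquad J_N = U . \tag{D-2} $$

The Jacobian for the transformation (D-2) is then given by

$$ \frac{\partial(J_1,\ldots,J_N)}{\partial(X_1,\ldots,X_{N-1},U)} = \begin{vmatrix} U & 0 & \cdots & 0 & X_1 \\ 0 & U & \cdots & 0 & X_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & U & X_{N-1} \\ 0 & 0 & \cdots & 0 & 1 \end{vmatrix} = U^{N-1} . \tag{D-3} $$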

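The (N-1)-variate probability density function for the ratios $X_i$ is given by

$$ P_R(\mathbf X) = \int_{-\infty}^{\infty} P(\mathbf J)\left|\frac{\partial(J_1,\ldots,J_N)}{\partial(X_1,\ldots,X_{N-1},U)}\right| dU . \tag{D-4} $$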


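Substitutions from (D-1), (D-2), and (D-3) into (D-4) then lead to the result

$$ P_R(\mathbf X) = \frac{1}{(2\pi)^{N/2}\,|\mathbf M|^{1/2}}\int_{-\infty}^{\infty} e^{-\frac12 Q}\,|U|^{N-1}\,dU , \tag{D-5} $$

where

$$ Q(X_i, \bar J_i, U) = \sum_{i,j=1}^{N} A_{ij}\,(U Y_i - \bar J_i)(U Y_j - \bar J_j) , $$

in which the coefficients $A_{ij}$ are the elements of the inverse covariance matrix $\mathbf M^{-1}$ and, by definition,

$$ Y_i = X_i , \quad i = 1, \ldots, N-1 , \qquad Y_N = 1 . \tag{D-6} $$

Then a straightforward calculation provides the result

$$ P_R(\mathbf X) = \frac{e^{-C/2}}{(2\pi)^{N/2}\,|\mathbf M|^{1/2}}\int_{-\infty}^{\infty} e^{-\frac12 AU^2 + BU}\,|U|^{N-1}\,dU , \tag{D-7} $$

where

$$ A = \mathbf Y^t \mathbf M^{-1} \mathbf Y , \qquad B = \bar{\mathbf J}^t \mathbf M^{-1} \mathbf Y , \qquad C = \bar{\mathbf J}^t \mathbf M^{-1} \bar{\mathbf J} , $$

in which $\mathbf Y$ is the N-dimensional vector whose components are given by (D-6).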

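When N is an odd number (D-7) can be written

$$ P_R(\mathbf X) = \frac{e^{-C/2}}{(2\pi)^{N/2}\,|\mathbf M|^{1/2}}\int_{-\infty}^{\infty} e^{-\frac12 AU^2 + BU}\,U^{N-1}\,dU . \tag{D-8} $$
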

However,  when  N  is  even 


Similar  steps  taken  with  (D-8),  for  the  case  in  which  N  is  odd, 
lead  to  the  result 


(D-10)


With the aid of the binomial expansion theorem (D-10) can be written

$$ P_R(\mathbf X) = \frac{e^{\frac{B^2-AC}{2A}}}{(2\pi)^{\frac{N-1}{2}}\,|\mathbf M|^{1/2}}\sum_{r=0}^{N-1}\binom{N-1}{r} B^{\,N-1-r}\,A^{\frac12(r+1)-N}\,\mu_r , \tag{D-11} $$

where $\mu_r$ is the r-th moment of the standard normal probability distribution, i.e., a Gaussian distribution with zero mean and a standard deviation of one. According to Ref. D-2 (p. 208) the moments of the standard normal distribution are given by

$$ \mu_0 = 1 , \qquad \mu_r = \begin{cases} 0 , & \text{for } r \text{ odd} \\ 1\cdot 3\cdots(r-1) , & \text{for } r \text{ even.} \end{cases} $$
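The finite sum is straightforward to evaluate once the scalars A, B, C and the determinant of M are known. The sketch below codes the sum in the form obtained by completing the square in the integral of (D-7) and expanding with the binomial theorem, i.e. $e^{(B^2-AC)/2A}\,[(2\pi)^{(N-1)/2}|M|^{1/2}]^{-1}\sum_r \binom{N-1}{r}B^{N-1-r}A^{(r+1)/2-N}\mu_r$. The function names are hypothetical and the arguments are assumed to be precomputed by the caller.

```python
import math


def normal_moment(r):
    """r-th moment of the standard normal: 0 for odd r, 1*3*...*(r-1) for even r."""
    if r % 2:
        return 0.0
    return float(math.prod(range(1, r, 2)))  # empty product gives 1 for r = 0


def ratio_pdf_series(a, b, c, det_m, n):
    """Evaluate the finite sum of (D-11); exact for odd n, approximate for even n."""
    prefactor = math.exp((b * b - a * c) / (2.0 * a)) / (
        (2.0 * math.pi) ** ((n - 1) / 2.0) * math.sqrt(det_m))
    total = 0.0
    for r in range(n):
        total += (math.comb(n - 1, r) * b ** (n - 1 - r)
                  * a ** (0.5 * (r + 1) - n) * normal_moment(r))
    return prefactor * total
```

For n = 3 only the r = 0 and r = 2 terms survive, which gives a density proportional to $(A + B^2)/A^{5/2}$; for even n the remainder term E(X) discussed in the text must still be added.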


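A comparison of the first integral with (D-10) shows, after a little manipulation, that for N even $P_R(\mathbf X)$ is given by (D-11) plus a remainder term $E(\mathbf X)$ given by

$$ E(\mathbf X) = \frac{2\,e^{\frac{B^2-AC}{2A}}}{(2\pi A)^{N/2}\,|\mathbf M|^{1/2}}\int_{B/\sqrt{A}}^{\infty}\left(t-\frac{B}{\sqrt{A}}\right)^{N-1} e^{-t^2/2}\,dt . \tag{D-12} $$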


By  applying  the  binomial  expansion  to  the  integrand  of  (D-12) 
it  is  possible  to  express  E(X)  as  a  finite  linear  combination 
of  incomplete  gamma  functions. 

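The simplest examples of even and odd N (except for the trivial case of N = 1) are N = 2 and N = 3. For N = 3, which was considered in Ref. D-1, (D-11) provides the earlier result

$$ P_R(\mathbf X) = \frac{A+B^2}{2\pi\,|\mathbf M|^{1/2}\,A^{5/2}}\;e^{\frac{B^2-AC}{2A}} . $$

For N = 2, (D-11) and (D-12) provide the result

$$ P_R(\mathbf X) = \frac{B}{\sqrt{2\pi\,|\mathbf M|}\;A^{3/2}}\;e^{\frac{B^2-AC}{2A}} + E(\mathbf X) , \tag{D-13} $$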


where 


For cases of practical interest C >> 1 because the means $\bar J_i$ will be many standard deviations away from zero. This can be seen, for example, in the data of Ref. D-3, for which mean equivalent temperatures in the thermal bands are all of the order of 300 deg K while the standard deviations are at most 2 or 3 deg K. A similar observation can be made for the solar bands, although the means at those wavelengths (1 μ - 3 μ) differ from zero by amounts of the order of 10 standard deviations rather than 100.


If C is, in fact, large and the quantity $B/\sqrt{A}$ is not, the exponential factor in (D-11) and (D-12) will guarantee that $P_R(\mathbf X)$ will be negligible in general. On the other hand, when $B^2/A$ is comparable to C in magnitude, i.e., when $B/\sqrt{A} \gg 1$, it is evident from (D-12) that $E(\mathbf X)$ will be negligible. Then (D-11), which is exact when N is odd, will also provide a good approximation to $P_R(\mathbf X)$ when N is even.


2.  Calculation  of  False  Alarm  Probabilities  for  Two-Color 
Systems 

As observed in Chapter IV, for a two-color system the mathematical model proposed in this paper implies that in data space the decision regions determined by an optimum two-dimensional or ratio discrimination rule will always be bounded by straight lines. In fact, for a two-dimensional rule the regions will be half planes, whereas for a ratio rule they will consist of one or more triangles or angular sectors.*

*An angular sector may be regarded as a triangle with one side at infinity. For numerical purposes that side may have any convenient orientation, and its intersections with the other two sides of the triangle can be specified arbitrarily as long as the cartesian coordinates of the intersections and coordinate differences are large, e.g., of the order of 1000 $\sigma$.

According to (3) in Chapter II, the probability of false alarm is given by

$$ P_{FA} = \frac{1}{2\pi\,|\mathbf M_C|^{1/2}}\iint_R \exp\left[-\tfrac12(\mathbf J-\bar{\mathbf J}_C)^t\,\mathbf M_C^{-1}(\mathbf J-\bar{\mathbf J}_C)\right] dJ_1\,dJ_2 , \tag{D-15} $$

where R is the region in which a point corresponds to a target detection as defined by the discrimination rule. To evaluate the integral in (D-15) it is convenient, first, to translate the coordinate system so that the clutter mean $\bar{\mathbf J}_C$ is at the origin of the new system. This is done by setting

$$ \mathbf r = \mathbf J - \bar{\mathbf J}_C , \tag{D-16} $$

whereupon (D-15) takes the form

$$ P_{FA} = \frac{1}{2\pi\,|\mathbf M_C|^{1/2}}\iint_{R'} \exp\left[-\tfrac12 Q(\mathbf r)\right] dx\,dy , \tag{D-17} $$

where

$$ Q(\mathbf r) = \mathbf r^t\,\mathbf M_C^{-1}\,\mathbf r , $$

and R' has been written in place of R as a reminder that the analytic description of the region R will be different in the new coordinate system. The next step is to change to polar coordinates; then (D-17) becomes


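$$ P_{FA} = \frac{1}{2\pi\,|\mathbf M_C|^{1/2}}\iint_{R'} \exp\left[-\tfrac12 r^2 Q(\theta)\right] r\,dr\,d\theta , \tag{D-18} $$

where

$$ Q(\theta) = A_{11}\cos^2\theta + 2A_{12}\sin\theta\cos\theta + A_{22}\sin^2\theta . \tag{D-19} $$

In (D-19) the coefficients $A_{ij}$ are elements of the inverse covariance matrix $\mathbf M_C^{-1}$ given, in terms of the standard deviations $\sigma_1$, $\sigma_2$ and the correlation coefficient $\rho$ for clutter statistics, by

$$ A_{11} = \frac{1}{(1-\rho^2)\,\sigma_1^2} , \qquad A_{22} = \frac{1}{(1-\rho^2)\,\sigma_2^2} , \qquad A_{12} = -\frac{\rho}{(1-\rho^2)\,\sigma_1\sigma_2} . \tag{D-20} $$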

For a triangular region R', (D-18) is a sum of three terms, one for each side, of the form

$$ P_i = \frac{1}{2\pi\,|\mathbf M_C|^{1/2}}\int_{\theta_i}^{\theta_{i+1}}\int_0^{r_i(\theta)} \exp\left[-\tfrac12 r^2 Q(\theta)\right] r\,dr\,d\theta , \tag{D-21} $$

where $\theta_i$ and $\theta_{i+1}$ are the angular coordinates of the end-points of the side i and the equation of the line of which the side is a segment is given in polar coordinates by

$$ r = r_i(\theta) = \frac{b_i}{\sin\theta - m_i\cos\theta} . \tag{D-22} $$


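In (D-22) $b_i$ is the y-intercept and $m_i$ is the slope of the line. It does not matter whether the origin of the coordinate system is inside or outside of the triangle as long as the integration over the intervals from $\theta_i$ to $\theta_{i+1}$ proceeds around the triangle in a counter-clockwise direction.

In (D-21) the integral over r can be evaluated explicitly. The result is a single integral; in fact,

$$ P_i = \frac{1}{2\pi\,|\mathbf M_C|^{1/2}}\int_{\theta_i}^{\theta_{i+1}} \frac{1-\exp\left[-\tfrac12 r_i^2(\theta)\,Q(\theta)\right]}{Q(\theta)}\,d\theta , \tag{D-23} $$

where $r_i(\theta)$ is given by (D-22) and $Q(\theta)$ by (D-19).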
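The single integral that remains after the radial integration, whose integrand is $[1-\exp(-\tfrac12 r_i^2(\theta)Q(\theta))]/Q(\theta)$, is well suited to a simple quadrature. The following sketch is an illustration under stated assumptions rather than the paper's procedure: the function names are hypothetical, each side is described by its two end-points instead of a slope and intercept (so vertical sides need no special case), a plain midpoint rule is used, and the origin (the clutter mean) is assumed not to lie on the line through any side unless it is one of that side's end-points.

```python
import math


def inverse_covariance(s1, s2, rho):
    """Elements A11, A12, A22 of the inverse of a 2x2 covariance matrix, as in (D-20)."""
    f = 1.0 - rho * rho
    return 1.0 / (f * s1 * s1), -rho / (f * s1 * s2), 1.0 / (f * s2 * s2)


def side_term(pa, pb, s1, s2, rho, steps=4000):
    """Contribution of the side pa -> pb, after the radial integration of (D-21)."""
    a11, a12, a22 = inverse_covariance(s1, s2, rho)
    # Line through pa and pb written as nx*x + ny*y = c.
    nx, ny = pb[1] - pa[1], pa[0] - pb[0]
    c = nx * pa[0] + ny * pa[1]
    theta_a = math.atan2(pa[1], pa[0])
    # Signed angle swept from pa to pb as seen from the origin.
    sweep = math.atan2(pa[0] * pb[1] - pa[1] * pb[0], pa[0] * pb[0] + pa[1] * pb[1])
    if sweep == 0.0:
        return 0.0  # side is radial or has an end-point at the origin
    h = sweep / steps
    total = 0.0
    for i in range(steps):
        t = theta_a + (i + 0.5) * h
        ct, st = math.cos(t), math.sin(t)
        q = a11 * ct * ct + 2.0 * a12 * st * ct + a22 * st * st
        r = c / (nx * ct + ny * st)  # distance from the origin to the side along direction t
        total += (1.0 - math.exp(-0.5 * r * r * q)) / q * h
    return total / (2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho))


def triangle_probability(vertices, s1, s2, rho):
    """Sum the three side terms, taking the vertices in counter-clockwise order."""
    return sum(side_term(vertices[i], vertices[(i + 1) % 3], s1, s2, rho)
               for i in range(3))
```

Using a signed sweep angle means that sides traversed clockwise as seen from the origin contribute negatively, which is what makes the sum correct whether the origin is inside or outside the triangle. An angular sector can be approximated, as the footnote suggests, by placing the far vertices on the order of 1000 standard deviations away.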

For  the  2D  rule  the  regions  R  and  R'  are  half-planes.  An 
analysis  similar  to  that  used  in  deriving  (D-23)  leads,  in  this 
case,  to  the  result 


where 


(D-25) 


In (D-25) b is the y-intercept and m is the slope of the line that separates the target from the background data points according to the 2D rule. The intercept b in (D-25) is defined in terms of the coordinate system centered at the mean $\bar{\mathbf J}_C$.



The formula (D-24), for the case of a 2D rule, is in terms of a single integral that can be evaluated numerically without difficulty. For the case of a ratio rule the false alarm probability is given by

$$ P_{FA} = P_1 + P_2 + P_3 , \tag{D-26} $$

wherein each term is given by a formula of the type depicted in (D-23). A straightforward numerical integration will also lead to the value of each term in (D-26).
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